Quantum theory makes the most accurate empirical predictions and yet it lacks simple, comprehensible
physical principles from which the theory can be uniquely derived. A broad class of probabilistic theories exists
which all share some features with quantum theory, such as probabilistic predictions for individual outcomes
(indeterminism), the impossibility of information transfer faster than the speed of light (no-signaling) or the
impossibility of copying unknown states (no-cloning). The vast majority of attempts to find physical principles
behind quantum theory either fall short of deriving the theory uniquely from the principles or are based on
abstract mathematical assumptions that themselves require a more conclusive physical motivation. Here, we show
that classical probability theory and quantum theory can be reconstructed from three reasonable axioms: (1)
(Information capacity) All systems with information carrying capacity of one bit are equivalent. (2) (Locality) The
state of a composite system is completely determined by measurements on its subsystems. (3) (Reversibility)
Between any two pure states there exists a reversible transformation. If one requires the transformation from the
last axiom to be continuous, one separates quantum theory from the classical probabilistic one. A remarkable
result following from our reconstruction is that no probability theory other than quantum theory can exhibit
entanglement without contradicting one or more axioms.



I. INTRODUCTION 

The historical development of scientific progress teaches us 
that every theory that was established and broadly accepted at 
a certain time was later inevitably replaced by a deeper and 
more fundamental theory of which the old one remains a spe- 
cial case. One celebrated example is Newtonian (classical) 
mechanics which was superseded by quantum mechanics at 
the beginning of the last century. It is natural to ask whether 
in a similar manner there could be logically consistent theo- 
ries that are more generic than quantum theory itself. It could 
then turn out that quantum mechanics is an effective descrip- 
tion of such a theory, only valid within our current restricted 
domain of experience. 

At present, quantum theory has been tested against very
specific alternative theories that, both mathematically and in
their concepts, are distinctly different. Instances of such
alternative theories are non-contextual hidden-variable theories [1],
local hidden-variable theories [2], crypto-nonlocal
hidden-variable theories, or some nonlinear variants
of the Schrödinger equation [5, 6, 7, 8]. Currently, many
groups are working on improving experimental conditions to
be able to test alternative theories based on various collapse
models. The common trait of all these proposals
is to suppress one or the other counter-intuitive feature
of quantum mechanics and thus keep some of the basic
notions of a classical world view intact. Specifically,
hidden-variable models would allow one to preassign definite values to
outcomes of all measurements, collapse models are mechanisms
for restraining superpositions between macroscopically
distinct states, and nonlinear extensions of the Schrödinger
equation may admit more localized solutions for wave-packet
dynamics, thereby resembling localized classical particles.

In recent years the new field of quantum information has
initiated interest in generalized probabilistic theories which
share certain features - such as the no-cloning and the
no-broadcasting theorems [14, 15] or the trade-off between state
disturbance and measurement [16] - generally thought of as
specifically quantum, yet shown to be present in all
except classical theory. These generalized probabilistic theories
can allow for stronger than quantum correlations in the
sense that they can violate Bell's inequalities more strongly than the
quantum Cirel'son bound (as is the case for the celebrated
"non-local boxes" of Popescu and Rohrlich [17]), though they
all respect the "non-signaling" constraint according to which
correlations cannot be used to send information faster than the
speed of light.

Since the majority of the features that have been highlighted 
as "typically quantum" are actually quite generic for all non- 
classical probabilistic theories, one could conclude that addi- 
tional principles must be adopted to single out quantum the- 
ory uniquely. Alternatively, these probabilistic theories indeed 
can be constructed in a logically consistent way, and might 
even be realized in nature in a domain that is still beyond 
our observations. The vast majority of attempts to find physical
principles behind quantum theory either fail to single out
the theory uniquely or are based on highly abstract mathematical
assumptions without an immediate physical meaning
(e.g. [18]).

On the way to reconstructions of quantum theory from
foundational physical principles rather than purely mathematical
axioms, one finds interesting examples coming from an
instrumentalist approach [19, 20, 21], where the focus is primarily
on primitive laboratory operations such as preparations,
transformations and measurements. While these reconstructions
are based on a short set of simple axioms, they still
partially use mathematical language in their formulation.

Evidently, the added value of reconstructions for a better
understanding of quantum theory originates from their power to
explain where the structure of the theory comes from. Candidates
for foundational principles were proposed, giving a basis
for an understanding of quantum theory as a general theory
of information supplemented by several information-theoretic
constraints [22, 23, 24, 25, 26]. In a wider context these
approaches belong to attempts to find an explanation for quantum
theory by putting primacy on the concept of information
or on the concept of probability, which again can be seen as
a way of quantifying information [27, 28, 29, 30, 31, 32, 33,
34, 35, 36]. Other principles were proposed for separation of
quantum correlations from general non-signaling correlations,
such as that communication complexity is not trivial [37, 38],
that communication of m classical bits causes information
gain of at most m bits ("information causality") [39], or that
any theory should recover classical physics in the macroscopic
limit [40].

In his seminal paper, Hardy [19] derives quantum theory
from "five reasonable axioms" within the instrumentalist
framework. He sets up a link between two natural numbers,
d and N, characteristic of any theory. Here d is the number of
degrees of freedom of the system and is defined as the minimum
number of real parameters needed to determine the state completely.
The dimension N is defined as the maximum number
of states that can be reliably distinguished from one another in
a single shot experiment. A closely related notion is the
information carrying capacity of the system, which is the maximal
number of bits encoded in the system, and is equal to log N
bits (logarithm to base 2) for a system of dimension N.

Examples of theories with an explicit functional dependence
d(N) are classical probability theory with the linear
dependence $d = N - 1$, and quantum theory with the quadratic
dependence for which it is necessary to use $d = N^2 - 1$ real
parameters to completely characterize the quantum state [35].
Higher-order theories with more general dependencies d(N)
might exist, as illustrated in Figure 1. Hardy's reconstruction
resorts to a "simplicity axiom" that discards a large class
of higher-order theories by requiring that for each given N,
d(N) takes the minimum value consistent with the other axioms.
However, without such an ad hoc assumption
it might be possible to construct the higher-order theories
in agreement with the rest of the axioms. In fact, an explicit
quartic theory for which $d = N^4 - 1$ [41], and theories for a
generalized bit (N = 2) for which $d = 2^r - 1$ and $r \in \mathbb{N}$ [42],
were recently developed, though all of them are restricted to
the description of individual systems only.
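
The dependencies d(N) mentioned above can be tabulated in a few lines. The following is only an illustrative sketch: the function names are hypothetical and chosen here for readability, and the formulas are simply the ones quoted in the text.

```python
# Degrees of freedom d(N) for the theories named in the text.
# Function names are illustrative and not taken from any reference.

def d_classical(n: int) -> int:
    return n - 1            # linear dependence, d = N - 1

def d_quantum(n: int) -> int:
    return n**2 - 1         # quadratic dependence, d = N^2 - 1

def d_quartic(n: int) -> int:
    return n**4 - 1         # quartic theory, d = N^4 - 1

def d_generalized_bit(r: int) -> int:
    return 2**r - 1         # generalized bit (N = 2), d = 2^r - 1

for n in (2, 3, 4):
    print(n, d_classical(n), d_quantum(n), d_quartic(n))
```

For N = 2 the classical bit and the qubit reappear as the members r = 1 and r = 2 of the generalized-bit family.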

It is clear from the previous discussion that the question of
which physical principles separate quantum theory
from the multitude of possible generalized probability
theories is still open. A particularly interesting unsolved problem
is whether the higher-order theories of Refs. [19, 41, 42]
can be extended to describe non-trivial, i.e. entangled, states
of composite systems. Any progress in the theoretical
understanding of these issues would be very desirable, in particular
because experimental research efforts in this direction have
been very sporadic. Although the majority of experiments also
indirectly verify the number of degrees of freedom of
quantum systems [66], there are only a few dedicated attempts
at such a direct experimental verification. Quaternionic quantum
mechanics (for which $d = 2N^2 - N - 1$) was tested in
a suboptimal setting [45] in a single neutron experiment in
1984 [43, 44], and more recently, the generalized measure theory
of Sorkin [46], in which higher-order interferences are predicted,
was tested in a three-slit experiment with photons [47].
Both experiments put an upper bound on the extent of the
observational effects the two alternative theories may produce.

FIG. 1: State spaces of a two-dimensional system in the generalized
probabilistic theories analyzed here. d is the minimal number of real
parameters necessary to determine the (generally mixed) state completely.
From left to right: a classical bit with one parameter (the
weight p in the mixture of two bit values), a real bit with two real
parameters (the state $\rho \in D(\mathbb{R}^2)$ is represented by a 2 x 2 real density matrix),
a qubit (quantum bit) with three real parameters (the state $\rho \in D(\mathbb{C}^2)$ is
represented by a 2 x 2 complex density matrix) and a generalized bit
for which d real parameters are needed to specify the state. Note
that, when one moves continuously from one pure state (represented
by a point on the surface of a sphere) to another, only in the classical
probabilistic theory must one go through the set of mixed states. Can
probability theories that are more generic than quantum theory be
extended in a logically consistent way to higher-dimensional and
composite systems? Can entanglement exist in these theories? Where
should we look in nature for potential empirical evidence of the
theories?

II. BASIC IDEAS AND THE AXIOMS 

Here we reconstruct quantum theory from three reasonable 
axioms. Following the general structure of any reconstruc- 
tion we first give a set of physical principles, then formulate 
their mathematical representation, and finally rigorously de- 
rive the formalism of the theory. We will only consider the 
case where the number of distinguishable states is finite. The 
three axioms which separate classical probability theory and 
quantum theory from all other probabilistic theories are: 

Axiom 1. (Information capacity) An elementary system has 
the information carrying capacity of at most one bit. All sys- 
tems of the same information carrying capacity are equiva- 
lent. 

Axiom 2. (Locality) The state of a composite system is com- 
pletely determined by local measurements on its subsystems 
and their correlations. 

Axiom 3. (Reversibility) Between any two pure states there 
exists a reversible transformation. 

A few comments on these axioms are appropriate here. The
most elementary system in the theory is a two-dimensional
system. All higher-dimensional systems will be built out of
two-dimensional ones. Recall that the dimension is defined
as the maximal number of states that can be reliably
distinguished from one another in a single shot experiment. By
the phrase "an elementary system has an information capacity
of at most one bit" we mean precisely that for any state
(pure or mixed) of a two-dimensional system there is a
measurement such that the state is a mixture of two states which
are reliably distinguished in the measurement. An alternative
formulation could be that any state of a two-dimensional system
can be prepared by mixing at most two basis states (i.e. states
perfectly distinguishable in a measurement); see Figure 2.
Roughly speaking, axiom 1 assumes that a state of an elementary
system can always be represented as a mixture of two
classical bits. This part of the axiom is inspired by Zeilinger's
proposal for a foundational principle for quantum theory [23].

FIG. 2: Illustration of the assumption stated in axiom 1. Consider
a toy world of a two-dimensional system in which the set of pure
states consists of only $\mathbf{x}_1$ and $\mathbf{x}_2$ and their orthogonal states $\mathbf{x}_1^\perp$ and
$\mathbf{x}_2^\perp$, respectively, and where only two measurements exist, which
distinguish $\{\mathbf{x}_1, \mathbf{x}_1^\perp\}$ and $\{\mathbf{x}_2, \mathbf{x}_2^\perp\}$. The convex set (represented by the grey
area within the circle) whose vertices are the four states contains all
physical (pure or mixed) states in the toy world. Now, choose a point
in the set, say $\mathbf{y} = \lambda\mathbf{x}_1 + (1 - \lambda)\mathbf{x}_2$. Axiom 1 states that any physical
state can be represented as a mixture of two orthogonal states
(i.e. states perfectly distinguishable in a single shot experiment), e.g.
$\mathbf{y} = \eta\mathbf{x} + (1 - \eta)\mathbf{x}^\perp$. This is not fulfilled in the toy world, but is satisfied
in a theory in which the entire circle represents the pure states and
where measurements can distinguish all pairs of orthogonal states.

The second statement in axiom 1 is motivated by the in- 
tuition that at the fundamental level there should be no dif- 
ference between systems of the same information carrying 
capacity. All elementary systems - be they part of higher 
dimensional systems or not - should have equivalent state 
spaces and equivalent sets of transformations and measure- 
ments. This seems to be a natural assumption if one makes 
no prior restrictions to the theory and preserves the full sym- 
metry between all possible elementary systems. This is why 
we have decided to put the statement as a part of axiom 1, 
rather than as a separate axiom. The particular formulation
used here is from Grinbaum [48], who suggested rephrasing
the "subspace axiom" of Hardy's reconstruction using physical
rather than mathematical language. The subspace axiom
states that a system whose state is constrained to belong to an 
M dimensional subspace (i.e. have support on only M of a set 
of N possible distinguishable states) behaves like a system of 
dimension M. 

In logical terms axiom 1 means the following. We can think
of two basis states as two binary propositions about an individual
system, such as (1) "The outcome of measurement A is
+1" and (2) "The outcome of measurement A is -1". An alternative
choice for the pair of propositions can be propositions
about joint properties of two systems, such as (1') "The out- 
comes of measurement A on the first system and of B on the 
second system are correlated" (i.e. either both +1 or both -1) 
and (2') "The outcomes of measurement A on the first system 
and of B on the second system are anticorrelated". The two 
choices for the pair of propositions correspond to two choices 
of basis states which each can be used to span the full state 
space of an abstract elementary system (also called "general- 
ized bit"). As we will see later, taking the latter choice, it will 
follow from axiom 1 alone that the state space must contain 
entangled states. 

Axiom 2 assumes that a specification of the probabilities for 
a complete set of local measurements for each of the subsys- 
tems plus the joint probabilities for correlations between these 
measurements is sufficient to determine completely the global 
state. Note that this property does hold in both quantum the- 
ory and classical probability theory, but not in quantum theory 
formulated on the basis of real or quaternionic amplitudes in- 
stead of complex. A closely related formulation of the axiom 
was given by Barrett [16]. 
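
The statement about real and quaternionic amplitudes can be illustrated by simple parameter counting. The sketch below assumes that local probabilities plus correlations fix the global state only if d(N_A N_B) = d_A + d_B + d_A d_B; matching counts are a necessary condition, not a proof of axiom 2. The function names are hypothetical, and the real-amplitude count d = N(N + 1)/2 - 1 (real symmetric matrices of unit trace) is an assumption added here, not a formula taken from the text.

```python
# Parameter counting behind axiom 2 (a sketch, not a proof).
# Local probabilities plus correlations supply d_A + d_B + d_A * d_B numbers;
# the global state of the composite system needs d(N_A * N_B) numbers.

def d_classical(n):
    return n - 1

def d_real(n):
    return n * (n + 1) // 2 - 1     # assumption: real symmetric, unit trace

def d_complex(n):
    return n**2 - 1

def d_quaternionic(n):
    return 2 * n**2 - n - 1

def counts_match(d, n_a, n_b):
    """True if local data supply exactly as many parameters as the global state needs."""
    d_a, d_b = d(n_a), d(n_b)
    return d(n_a * n_b) == d_a + d_b + d_a * d_b

theories = [("classical", d_classical), ("real", d_real),
            ("complex", d_complex), ("quaternionic", d_quaternionic)]

for name, d in theories:
    ok = all(counts_match(d, a, b) for a in range(2, 5) for b in range(2, 5))
    print(f"{name:13s} counting consistent with axiom 2: {ok}")
```

Under these assumptions only the classical and the complex counts match, in line with the remark above.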

Finally, axiom 3 requires that transformations are reversible.
This is assumed solely so that the set
of transformations forms a group. It is natural to
assume that a composition of two physical transformations is 
again a physical transformation. It should be noted that this 
axiom could be used to exclude the theories in which "non- 
local boxes" occur, because there the dynamical group is triv- 
ial, in the sense that it is generated solely by local operations 
and permutations of systems with no entangling reversible 
transformations (that is, non-local boxes cannot be prepared 
from product states) [49]. 

If one requires the reversible transformation from our
axiom 3 to be continuous, one obtains

Axiom 3'. (Continuity) Between any two pure states there
exists a continuous reversible transformation,

which separates quantum theory from classical probability
theory. The same axiom is also present in Hardy's reconstruction.
By a continuous transformation we mean here that every
transformation can be made up from a sequence of transformations
only infinitesimally different from the identity.

A remarkable result following from our reconstruction is 
that quantum theory is the only probabilistic theory in which 
one can construct entangled states and fulfill the three axioms. 
In particular, in the higher-order theories of Refs. [19, 41, 42]
composite systems can only have trivial separable states. On
the other hand, we will see that axiom 1 alone requires en- 
tangled states to exist in all non-classical theories. This will 
allow us to discard the higher-order theories in our reconstruc- 
tion scheme without invoking the simplicity argument. 

As a by-product of our reconstruction we will be able to
answer why in nature only "odd" correlations (i.e. (1, 1, -1),
(1, -1, 1), (-1, 1, 1) and (-1, -1, -1)) are observed when two
maximally entangled qubits (spin-1/2 particles) are both measured
along the directions x, y and z, respectively. The most familiar
example is the singlet state $|\psi^-\rangle = \frac{1}{\sqrt{2}}(|0\rangle_1|1\rangle_2 - |1\rangle_1|0\rangle_2)$,
with anticorrelated results for an arbitrary but identical choice
of measurement directions for the two qubits. We will show that
the "mirror quantum mechanics" in which only "even"
correlations appear cannot be extended consistently to composite
systems of three bits.
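
The "odd" correlation patterns can be checked directly for the four Bell states of standard quantum mechanics. The sketch below is an illustration in plain Python, not part of the reconstruction; the helper names (`kron`, `expectation`, `correlations`) and the basis ordering |00>, |01>, |10>, |11> are choices made here.

```python
# Correlations <sigma_k (x) sigma_k>, k = x, y, z, for the four Bell states.
from math import sqrt

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, psi):
    """<psi| op |psi> for a 4-component state vector psi."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(4)) for i in range(4)]
    return sum(psi[i].conjugate() * op_psi[i] for i in range(4)).real

s = 1 / sqrt(2)
# Basis order |00>, |01>, |10>, |11>.
bell_states = {
    "phi+": [s, 0, 0, s],
    "phi-": [s, 0, 0, -s],
    "psi+": [0, s, s, 0],
    "psi-": [0, s, -s, 0],   # the singlet state
}

def correlations(psi):
    """Rounded (xx, yy, zz) correlations of a two-qubit state."""
    psi = [complex(c) for c in psi]
    return tuple(round(expectation(kron(P, P), psi)) for P in (X, Y, Z))

for name, psi in bell_states.items():
    print(name, correlations(psi))
```

Each of the four states yields one of the patterns listed above, and the product of the three correlations is -1 in every case.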

Our reconstruction will be given in the framework of the typical
experimental situation an observer faces in the laboratory.
While this instrumentalist approach is a useful paradigm to 
work with, it might not be necessary. One could think about 
axioms 1 and 3 as referring to objective features of elemen- 
tary constituents of the world which need not necessarily be 
related to laboratory actions. In contrast, axiom 2 seems to 
acquire a meaning only within the instrumentalist approach as 
it involves the word "measurement". Even here one could fol- 
low a suggestion of Grinbaum [48] and rephrase the axiom to 
the assumption of "multiplicability of the information carry- 
ing capacity of subsystems." 

Concluding this section, we note that the conceptual
groundwork for the ideas presented here has been prepared
most notably by Weizsäcker [50], Wheeler [51] and
Zeilinger [23], who proposed that the notion of the elementary
yes-no alternative, or the "Ur", should play a pivotal role
when reconstructing quantum physics.



III. BASIC NOTIONS

Following Hardy [19] we distinguish three types of devices
in a typical laboratory. The preparation device prepares systems
in some state. It has a set of switches on it for varying
the state produced. After state preparation the system passes
through a transformation device. It also has a set of switches
on it for varying the transformation applied to the state. Finally,
the system is measured in a measurement apparatus. It
again has switches on it, with the help of which an experimenter can
choose different measurement settings. This device outputs
classical data, e.g. a click in a detector or a spot on an observation
screen.

We define the state of a system as that mathematical object 
from which one can determine the probability for any conceiv- 
able measurement. Physical theories can have enough struc- 
ture that it is not necessary to give an exhaustive list of all 
probabilities for all possible measurements, but only a list of 
probabilities for some minimal subset of them. We refer to
this subset as the fiducial set. Therefore, the state is specified by
a list of d (where d depends on the dimension N) probabilities for
a set of fiducial measurements: $\mathbf{p} = (p_1, \ldots, p_d)$. The state is
pure if it is not a (convex) mixture of other states. The state
is mixed if it is not pure. For example, the mixed state $\mathbf{p}$
generated by preparing state $\mathbf{p}_1$ with probability $\lambda$ and $\mathbf{p}_2$ with
probability $1 - \lambda$ is $\mathbf{p} = \lambda\mathbf{p}_1 + (1 - \lambda)\mathbf{p}_2$.

When we refer to an N-dimensional system, we assume that
there are N states each of which identifies a different outcome 
of some measurement setting, in the sense that they return 
probability one for the outcome. We call this set a set of basis 
or orthogonal states. Basis states can be chosen to be pure. To 
see this assume that some mixed state identifies one outcome. 



We can decompose the state into a mixture of pure states, each 
of which has to return probability one, and thus we can use 
one of them to be a basis state. We will show later that each 
pure state corresponds to a unique measurement outcome. 

If the system in state $\mathbf{p}$ is incident on a transformation
device, its state will be transformed to some new state $U(\mathbf{p})$. The
transformation U is a linear function of the state $\mathbf{p}$ as it needs
to preserve the linear structure of mixtures. For example,
consider the mixed state $\mathbf{p}$ which is generated by preparing state
$\mathbf{p}_1$ with probability $\lambda$ and $\mathbf{p}_2$ with probability $1 - \lambda$. Then, in
each single run, either $\mathbf{p}_1$ or $\mathbf{p}_2$ is transformed and thus one
has:

$U(\lambda\mathbf{p}_1 + (1 - \lambda)\mathbf{p}_2) = \lambda U(\mathbf{p}_1) + (1 - \lambda) U(\mathbf{p}_2)$. (1)

It is natural to assume that a composition of two or more
transformations again belongs to the set of (reversible)
transformations. This set forms some abstract group. Axiom 3 states that
the transformations are reversible, i.e. for every U there is an
inverse group element $U^{-1}$. Here we assume that every
transformation has a matrix representation U and that there is an
orthogonal representation of the group: there exists an invertible
matrix S such that $O = S U S^{-1}$ is an orthogonal matrix,
i.e. $O^T O = 1$, for every U (we use the same notation both for
the group element and for its matrix representation). This does
not put severe restrictions on the group of transformations, as
it is known that all compact groups have such a representation
(the Schur-Auerbach lemma) [55]. Since the transformations
keep the probabilities in the range [0, 1], the group has to be
compact [19]. All finite groups and all continuous Lie
groups are therefore included in our consideration.

Given a measurement setting, the outcome probability
$P_{\text{meas}}$ can be computed by some function f of the state $\mathbf{p}$,

$P_{\text{meas}} = f(\mathbf{p})$. (2)

Like a transformation, the measurement cannot change the
mixing coefficients in a mixture, and therefore the measured
probability is a linear function of the state $\mathbf{p}$:

$f(\lambda\mathbf{p}_1 + (1 - \lambda)\mathbf{p}_2) = \lambda f(\mathbf{p}_1) + (1 - \lambda) f(\mathbf{p}_2)$. (3)

IV. ELEMENTARY SYSTEM: SYSTEM OF 
INFORMATION CAPACITY OF 1 BIT 

A two-dimensional system has two distinguishable outcomes
which can be identified by a pair of basis states $\{\mathbf{p}, \mathbf{p}^\perp\}$.
The state is specified by d probabilities $\mathbf{p} = (p_1, \ldots, p_d)$ for d
fiducial measurements, where $p_i$ is the probability for a particular
outcome of the i-th fiducial measurement (the dependent
probabilities $1 - p_i$ for the opposite outcomes are omitted in
the state description). Instead of using the probability vector $\mathbf{p}$
we will specify the state by its Bloch representation $\mathbf{x}$, defined
as a vector with d components:

$x_i = 2p_i - 1$. (4)

The mapping between the two different representations is an
invertible affine map and therefore preserves the structure of
the mixture: $\lambda\mathbf{p}_1 + (1 - \lambda)\mathbf{p}_2 \mapsto \lambda\mathbf{x}_1 + (1 - \lambda)\mathbf{x}_2$.
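
That the change of representation commutes with mixing can be checked numerically. The sketch below uses hypothetical fiducial probabilities for d = 3 and illustrative helper names (`to_bloch`, `mix`); the map x_i = 2 p_i - 1 is affine, which is enough to preserve convex mixtures.

```python
# Bloch representation x_i = 2 p_i - 1 and its compatibility with mixing.

def to_bloch(p):
    """Map a vector of fiducial probabilities to its Bloch representation."""
    return [2 * p_i - 1 for p_i in p]

def mix(lam, a, b):
    """Convex mixture lam * a + (1 - lam) * b, taken componentwise."""
    return [lam * a_i + (1 - lam) * b_i for a_i, b_i in zip(a, b)]

p1 = [1.0, 0.5, 0.5]    # hypothetical state, Bloch vector (1, 0, 0)
p2 = [0.5, 0.5, 1.0]    # hypothetical state, Bloch vector (0, 0, 1)
lam = 0.25

lhs = to_bloch(mix(lam, p1, p2))              # map the mixture
rhs = mix(lam, to_bloch(p1), to_bloch(p2))    # mix the mapped states
print(lhs, rhs)
```

Both orders of operation give the same Bloch vector, as the text asserts.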
It is convenient to define a totally mixed state
$\mathbf{E} = \frac{1}{\mathcal{N}} \sum_{\mathbf{x} \in S_{\text{pure}}} \mathbf{x}$, where $S_{\text{pure}}$ denotes the set of pure states and
$\mathcal{N}$ is the normalization constant. In the case of a continuous
set of pure states the summation has to be replaced by a
proper integral. It is easy to verify that $\mathbf{E}$ is a totally invariant
state. This implies that every measurement, and in particular
the fiducial ones, will return the same probability for all
outcomes. In the case of a two-dimensional system this probability
is 1/2. Therefore, the Bloch vector of the totally mixed
state is the zero vector, $\mathbf{E} = \mathbf{0}$.

The transformation U does not change the totally mixed
state, hence $U(\mathbf{0}) = \mathbf{0}$. The last condition together with the
linearity condition (1) implies that any transformation is
represented by some $d \times d$ real invertible matrix U. The same
reasoning holds for measurements. Therefore, the measured
probability is given by the formula:

$P_{\text{meas}} = \frac{1}{2}(1 + \mathbf{r}^T\mathbf{x})$. (5)

The vector r represents the outcome for the given measure- 
ment setting. For example, the vector (1, 0, 0, . . .) represents 
one of the outcomes for the first fiducial measurement. 

According to axiom 1 any state is a classical mixture of
some pair of orthogonal states. For example, the totally mixed
state is an equally weighted mixture of some orthogonal states,
$\mathbf{E} = \frac{1}{2}\mathbf{x} + \frac{1}{2}\mathbf{x}^\perp$. Take $\mathbf{x}$ to be the reference state. According to
axiom 3 we can generate the full set of states by applying all
possible transformations to the reference state. Since the totally
mixed state is invariant under the transformations, the
pair of orthogonal states is represented by a pair of antiparallel
vectors, $\mathbf{x}^\perp = -\mathbf{x}$. Consider the set $S_{\text{pure}} = \{U\mathbf{x} \mid \forall U\}$
of all pure states generated by applying all transformations to
the reference state. If one uses the orthogonal representation
of the transformations, $U = S^{-1} O S$, which was introduced
above, one maps $\mathbf{x} \mapsto S\mathbf{x}$ and $U \mapsto O$. Hence, the
transformation $U\mathbf{x} \mapsto SU\mathbf{x} = OS\mathbf{x}$ is norm preserving. We conclude
that all pure states are points on a d-dimensional ellipsoid
described by $\|S\mathbf{x}\| = c$ with $c > 0$.

Now, we want to show that any vector $\mathbf{x}$ satisfying $\|S\mathbf{x}\| = c$
is a physical state and therefore the set of states has to be the
whole ellipsoid. Let $\mathbf{x}$ be some vector satisfying $\|S\mathbf{x}\| = c$ and
$\mathbf{x}(t) = t\mathbf{x}$ a line through the origin (the totally mixed state), as shown
in Figure 3 (left). Within the set of pure states we can always
find d linearly independent vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_d\}$. For each state
$\mathbf{x}_i$ there is a corresponding orthogonal state $\mathbf{x}_i^\perp = -\mathbf{x}_i$ in the set
of states. We can expand a point on the line in the linearly
independent set of vectors: $\mathbf{x}(t) = t \sum_{i=1}^{d} c_i \mathbf{x}_i$. For sufficiently
small t we can define pairs of non-negative numbers
$\lambda_i(t) = \frac{1}{2}(\frac{1}{d} + t c_i)$ and $\lambda_i^\perp(t) = \frac{1}{2}(\frac{1}{d} - t c_i)$ with
$\sum_{i=1}^{d} (\lambda_i(t) + \lambda_i^\perp(t)) = 1$
such that $\mathbf{x}(t)$ is a mixture $\mathbf{x}(t) = \sum_{i=1}^{d} (\lambda_i(t)\mathbf{x}_i + \lambda_i^\perp(t)\mathbf{x}_i^\perp)$ and
therefore is a physical state. Then, according to axiom 1 there
exists a pair of basis states $\{\mathbf{x}_0, -\mathbf{x}_0\}$ such that $\mathbf{x}(t)$ is a mixture
of them,

$\mathbf{x}(t) = t\mathbf{x} = \alpha\mathbf{x}_0 + (1 - \alpha)(-\mathbf{x}_0)$, (6)

where $\alpha = \frac{1 + t}{2}$ and $\mathbf{x} = \mathbf{x}_0$. This implies that $\mathbf{x}$ is a pure state
and therefore all points of the ellipsoid are physical states.
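
The mixture used in this proof can be verified numerically for d = 2. The sketch below assumes pure states on the unit circle and takes the weights to be (1/2)(1/d + t c_i) and (1/2)(1/d - t c_i), which is our reading of the proof; the particular basis, direction and value of t are hypothetical choices.

```python
# Check of the mixture used in the proof, for d = 2 with pure states on the unit circle.
from math import cos, sin, pi

d = 2
basis = [(1.0, 0.0), (0.0, 1.0)]     # linearly independent pure states x_1, x_2
x = (cos(pi / 3), sin(pi / 3))       # target direction; here c_i = x[i]
c = list(x)
t = 0.3                              # small enough that all weights stay non-negative

lam = [0.5 * (1 / d + t * c_i) for c_i in c]         # weights of x_i
lam_perp = [0.5 * (1 / d - t * c_i) for c_i in c]    # weights of x_i^perp = -x_i

# Mixture sum_i (lam_i * x_i + lam_perp_i * (-x_i)), component by component.
mixture = [
    sum(l * b[k] + lp * (-b[k]) for l, lp, b in zip(lam, lam_perp, basis))
    for k in range(d)
]
target = [t * x_k for x_k in x]

print(sum(lam) + sum(lam_perp))      # total weight
print(mixture, target)               # the mixture reproduces x(t) = t x
```

The weights are non-negative, sum to one, and reproduce the point x(t), so x(t) is a convex mixture of physical states.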




FIG. 3: (Left) Illustration of the proof that the entire d-dimensional
ellipsoid (here represented by a circle; d = 2) contains physical
states. Consider a line $\mathbf{x}(t) = t\mathbf{x}$ through the origin. A point on
the line can be expanded in a set of linearly independent vectors $\mathbf{x}_i$
(here $\mathbf{x}_1$ and $\mathbf{x}_2$). For sufficiently small t (i.e. when the line is within
the gray square) the point $\mathbf{x}(t)$ can be represented as a convex mixture
over $\mathbf{x}_i$ and their orthogonal vectors $\mathbf{x}_i^\perp$ and thus is a physical state.
According to axiom 1, $\mathbf{x}(t)$ can be represented as a convex mixture of
two orthogonal pure states $\mathbf{x}_0$ and $\mathbf{x}_0^\perp$: $\mathbf{x}(t) = t\mathbf{x} = \alpha\mathbf{x}_0 + (1 - \alpha)(-\mathbf{x}_0)$,
where $\mathbf{x} = \mathbf{x}_0$ (see text for details). This implies that every point in
the ellipsoid is a physical state. (Right) Illustration of the proof that in
the orthogonal representation the measurement vector $\mathbf{m}$ that identifies
the state $\mathbf{y}$, i.e. for which the probability $P_{\text{meas}} = \frac{1}{2}(1 + \mathbf{m}^T\mathbf{y}) = 1$,
is identical to the state vector, $\mathbf{m} = \mathbf{y}$. Suppose that $\mathbf{m} \neq \mathbf{y}$; then
$\|\mathbf{m}\| > 1$, since the state vector is normalized. But then the same
measurement for a state $\mathbf{y}'$ parallel to $\mathbf{m}$ would return a probability
larger than 1, which is nonphysical. Thus $\mathbf{m} = \mathbf{y}$.



For every pure state $\mathbf{x}$ there exists at least one measurement
setting with the outcome $\mathbf{r}$ such that the outcome probability
is one, hence $\mathbf{r}^T\mathbf{x} = 1$. Let us define new coordinates $\mathbf{y} = \frac{1}{c} S\mathbf{x}$
and $\mathbf{m} = c\,(S^{-1})^T\mathbf{r}$ in the orthogonal representation. The set of
pure states in the new coordinates is a (d - 1)-sphere
$S^{d-1} = \{\mathbf{y} \mid \|\mathbf{y}\| = 1\}$ of unit radius. The probability rule (5) remains
unchanged in the new coordinates:

$P_{\text{meas}} = \frac{1}{2}(1 + \mathbf{m}^T\mathbf{y})$. (7)

Thus, one has $\mathbf{m}^T\mathbf{y} = 1$. Now, assume that $\mathbf{m} \neq \mathbf{y}$. Then
$\|\mathbf{m}\| > 1$ and the vectors $\mathbf{m}$ and $\mathbf{y}$ span a two-dimensional
plane as illustrated in Figure 3 (right). The set of pure states
within this plane is a unit circle. Choose the pure state $\mathbf{y}'$ to
be parallel to $\mathbf{m}$. Then the outcome probability is
$P_{\text{meas}} = \frac{1}{2}(1 + \|\mathbf{m}\|\|\mathbf{y}'\|) > 1$, which is non-physical, hence $\mathbf{m} = \mathbf{y}$.
Therefore, to each pure state $\mathbf{y}$ we associate a measurement
vector $\mathbf{m} = \mathbf{y}$ which identifies it. Equivalently, in the original
coordinates, to each $\mathbf{x}$ we associate a measurement vector
$\mathbf{r} = D\mathbf{x}$, where $D = \frac{1}{c^2} S^T S$ is a positive, symmetric matrix. A
proof of this relation for the restricted case of d = 3 can be 
found inRef. JH]. 
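
The argument that the measurement vector must coincide with the state it identifies can be made concrete in d = 2. The sketch below uses a hypothetical vector m that differs from y but still satisfies m.y = 1, and shows that it would assign a probability above one to the pure state parallel to m. The helper names are illustrative.

```python
# Why a measurement vector m != y with m.y = 1 is unphysical (d = 2 sketch).
from math import sqrt

def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))

def p_meas(m, y):
    """Probability rule P = (1 + m.y) / 2 in the orthogonal representation."""
    return 0.5 * (1 + dot(m, y))

y = (1.0, 0.0)                                  # a pure state on the unit circle
m = (1.0, 0.5)                                  # hypothetical m != y with m.y = 1
norm_m = sqrt(dot(m, m))
y_prime = tuple(m_i / norm_m for m_i in m)      # pure state parallel to m

print(p_meas(m, y))         # m identifies y with certainty
print(p_meas(m, y_prime))   # exceeds 1, so this m cannot be physical
```

Any m satisfying m.y = 1 other than y itself has norm larger than one, so the same contradiction arises for every such choice.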

From now on, instead of the measurement vector $\mathbf{r}$ we will
use the pure state $\mathbf{x}$ which identifies it. When we say that the
measurement along the state $\mathbf{x}$ is performed we mean the
measurement given by $\mathbf{r} = D\mathbf{x}$. The measurement setting is given
by a pair of measurement vectors $\mathbf{r}$ and $-\mathbf{r}$. The measured
probability when the state $\mathbf{x}_1$ is measured along the state $\mathbf{x}_2$
follows from formula (5):

$P(\mathbf{x}_1, \mathbf{x}_2) = \frac{1}{2}(1 + \mathbf{x}_1^T D \mathbf{x}_2)$. (8)

We can choose orthogonal eigenvectors of the matrix D as
the fiducial set of states (measurements):

$D\mathbf{x}_i = a_i\mathbf{x}_i$, (9)

where $a_i$ are eigenvalues of D. Since the $\mathbf{x}_i$ are pure states, they
satisfy $\mathbf{x}_i^T D \mathbf{x}_j = \delta_{ij}$. The set of pure states becomes a unit
sphere $S^{d-1} = \{\mathbf{x} \mid \|\mathbf{x}\| = 1\}$ and the probability formula is
reduced to

$P(\mathbf{x}_1, \mathbf{x}_2) = \frac{1}{2}(1 + \mathbf{x}_1^T\mathbf{x}_2)$. (10)

This corresponds to a choice of a complete set of mutually
complementary measurements (i.e. mutually unbiased basis
sets) for the fiducial measurements. The states identifying
outcomes of complementary measurements satisfy $P(\mathbf{x}_i, \mathbf{x}_j) = \frac{1}{2}$
for $i \neq j$. Two observables are said to be mutually complementary
if complete certainty about one of the observables
(one of two outcomes occurs with probability one) precludes
any knowledge about the others (the probability for both
outcomes is 1/2). Given some state $\mathbf{x}$, the i-th fiducial measurement
returns probability $p_i = \frac{1}{2}(1 + x_i)$. Therefore, $x_i$ is the mean
value of a dichotomic observable $b_i = +1\,\mathbf{x}_i - 1\,\mathbf{x}_i^\perp$ with two
possible outcomes $b_i = \pm 1$.
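
The complementarity of the fiducial measurements can be illustrated for d = 3, the qubit case. The sketch below takes the fiducial states to be the three coordinate unit vectors and applies the reduced probability formula (10); the helper name `prob` and the example state are choices made here.

```python
# Complementarity of the fiducial measurements for d = 3 (the qubit case).

def prob(x1, x2):
    """P(x1, x2) = (1 + x1.x2) / 2 for Bloch vectors x1, x2."""
    return 0.5 * (1 + sum(a * b for a, b in zip(x1, x2)))

d = 3
fiducial = [tuple(1.0 if i == j else 0.0 for j in range(d)) for i in range(d)]

# Each fiducial state gives certainty for its own measurement and 1/2 for the others.
table = [[prob(xi, xj) for xj in fiducial] for xi in fiducial]
print(table)

x = (0.6, 0.0, 0.8)                       # an example pure state with |x| = 1
p = [prob(x, xi) for xi in fiducial]      # fiducial probabilities p_i = (1 + x_i) / 2
print(p)
```

Certainty about one fiducial measurement leaves the outcomes of the other two completely undetermined, which is the defining property of mutually complementary observables.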

A theory in which the state space of the generalized bit
is represented by a (d - 1)-sphere has d mutually complementary
observables. This is a characteristic feature of the
theories and they can be ordered according to their number. 
For example, classical physics has no complementary observ- 
ables, real quantum mechanics has two, complex (standard) 
quantum mechanics has three (e.g. the spin projections of 
a spin- 1/2 system along three orthogonal directions) and the 
one based on quaternions has five mutually complementary 
observables. Note that higher-order theories of a single gen- 
eralized bit are such that the qubit theory can be embedded in 
them in the same way in which classical theory of a bit can be 
embedded in qubit theory itself. 

Higher-order theories can have even better information
processing capacity than quantum theory. For example, the
computational abilities of the theories with d = 2^r and r ∈ N
in solving the Deutsch-Jozsa type of problems increase with
the number of mutually complementary measurements [42].
It is likely that the larger this number is the larger the error
rate would be in secret key distribution in these theories, in a
similar manner in which the 6-state is advantageous over the
4-state protocol in (standard) quantum mechanics. In the first
case one uses all three mutually complementary observables
and in the second one only two of them. (See Ref. [52] for a
review on characterizing generalized probabilistic theories in
terms of their information-processing power and Ref. [53] for
investigating the same question in the much more general
framework of compact closed categories.)

A final remark on higher-order theories is of more
speculative nature. In various approaches to quantum theory of
gravity one predicts at the Planck scale the dimension of
space-time to be different from 3 + 1 [54]. If one considers
directional degrees of freedom (spin), then the (d - 1)-sphere (Bloch
sphere) might be interpreted as the state space of a spin system
embedded in real (ordinary) space of dimension d, in general
different from 3, which is the special case of quantum theory.

The reversible transformation R preserves the purity of state,
||Rx|| = ||x||, and therefore R is an orthogonal matrix. We have
shown that the state space is the full (d - 1)-sphere. According
to axiom 3 the set of transformations must be rich enough
to generate the full sphere. If d = 1 (classical bit), the group
of transformations is discrete and contains only the identity
and the bit-flip. If d > 1, the group is continuous and is some
subgroup of the orthogonal group O(d). Every orthogonal
matrix has determinant either 1 or -1. The orthogonal matrices
with determinant 1 form a normal subgroup of O(d), known
as the special orthogonal group SO(d). The group O(d) has
two connected components: the identity component, which
is the SO(d) group, and the component formed by orthogonal
matrices with determinant -1. Since every two points on
the (d - 1)-sphere are connected by some transformation, the
group of transformations is at least the SO(d) group. If we
include even a single transformation with determinant -1, the set
of transformations becomes the entire O(d) group. (Later we
will show that only some d are in agreement with our three
axioms and for these d's the set of physical transformations
will be shown to be the SO(d) group.)



V. COMPOSITE SYSTEM AND THE NOTION OF 
LOCALITY 

We now introduce a description of composite systems. We
assume that when one combines two systems of dimension L_1
and L_2 into a composite one, one obtains a system of
dimension L_1 L_2. Consider a composite system consisting of two
generalized bits and choose a set of d complementary
measurements on each subsystem as fiducial measurements.
According to axiom 2 the state of the composite system is
completely determined by a set of real parameters obtainable from
local measurements on the two generalized bits and their
correlations. We obtain 2d independent real parameters from the
set of local fiducial measurements and additional d^2
parameters from correlations between them. This gives altogether
d^2 + 2d = (d + 1)^2 - 1 parameters. They are the components
x_i, y_i, i ∈ {1, ..., d}, of the local Bloch vectors and T_ij of the
correlation tensor:

    x_i = p^{(i)}(A = 1) - p^{(i)}(A = -1),   (11)
    y_j = p^{(j)}(B = 1) - p^{(j)}(B = -1),   (12)
    T_{ij} = p^{(ij)}(AB = 1) - p^{(ij)}(AB = -1).   (13)

Here, for example, p^{(i)}(A = 1) is the probability to obtain
outcome A = 1 when the i-th measurement is performed on
the first subsystem and p^{(ij)}(AB = 1) is the joint probability
to obtain correlated results (i.e. either A = B = +1 or
A = B = -1) when the i-th measurement is performed on the first
subsystem and the j-th measurement on the second one.

Note that axiom 2 "The state of a composite system is
completely determined by local measurements on its subsystems
and their correlations" is formulated in a way that the
non-signaling condition is implicitly assumed to hold. This is
because it is sufficient to speak about "local measurements"
alone without specifying the choice of measurement setting
on the other, potentially distant, subsystem. Therefore, x_i does
not depend on j, and y_j does not depend on i.

We represent a state by the triple ψ = (x, y, T), where x
and y are the local Bloch vectors and T is a d × d real matrix
representing the correlation tensor. The product (separable)
state is represented by ψ_p = (x, y, T), where T = x y^T is of
product form, because the correlations are just products of the
components of the local Bloch vectors. We call the pure state
entangled if it is not a product state.
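
The parameter counting and the product-form condition T = x y^T can be sketched as follows. This is only an illustration; the helper names are hypothetical, and the entangled example uses the correlation tensor diag[1, -1, 1] that appears later in the text.

```python
def outer(x, y):
    """Outer product x y^T as a list of rows."""
    return [[a * b for b in y] for a in x]


def product_state(x, y):
    """State triple (x, y, T) with correlation tensor of product form."""
    return (list(x), list(y), outer(x, y))


def parameter_count(d):
    """2d local parameters plus d^2 correlation parameters."""
    return 2 * d + d * d


def is_product(psi, tol=1e-12):
    """True if the correlation tensor equals x y^T entry by entry."""
    x, y, T = psi
    return all(abs(T[i][j] - x[i] * y[j]) <= tol
               for i in range(len(x)) for j in range(len(y)))


separable = product_state([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
entangled = ([0.0] * 3, [0.0] * 3,
             [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]])
```

For d = 3 the count gives 15 = 4^2 - 1, the number of real parameters of a two-qubit density matrix.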

The measured probability is a linear function of the state ψ.
If we prepare totally mixed states of the subsystems (0, 0, 0),
the probability for any outcome of an arbitrary measurement
will be 1/4. Therefore, the outcome probability can be written
as:

    P_{measur} = \frac{1}{4}(1 + (r, \psi)),   (14)

where r = (r_1, r_2, K) is a measurement vector associated to
the observed outcome and (..., ...) denotes the scalar product:

    (r, \psi) = r_1^T x + r_2^T y + \mathrm{Tr}(K^T T).   (15)

Now, assume that r = (r_1, r_2, K) is associated to the
outcome which is identified by some product state
ψ_p = (x_0, y_0, T_0). If we perform a measurement on an arbitrary
product state ψ = (x, y, T), the outcome probability has to
factorize into the product of the local outcome probabilities of
the form (10):

    P_{measur} = \frac{1}{4}(1 + r_1^T x + r_2^T y + x^T K y)   (16)
               = P_1(x_0, x) P_2(y_0, y)   (17)
               = \frac{1}{2}(1 + x_0^T x) \frac{1}{2}(1 + y_0^T y)   (18)
               = \frac{1}{4}(1 + x_0^T x + y_0^T y + x^T x_0 y_0^T y),   (19)

which holds for all x, y. Comparing (16) with (19) gives
r_1 = x_0, r_2 = y_0 and K = x_0 y_0^T = T_0, therefore we have
r = ψ_p. For each product state ψ_p there is a unique outcome
r = ψ_p which identifies it. We will later show that the
correspondence r = ψ holds for all pure states ψ.
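
The factorization in equations (16)-(19) can be checked numerically for one concrete choice of product states. The sketch below assumes d = 3 and arbitrary illustrative unit vectors; the function names are not from the paper.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def outer(a, b):
    return [[p * q for q in b] for p in a]


def product_state(x, y):
    return (x, y, outer(x, y))


def p_measure(r, psi):
    """Eqs. (14)-(15): probability for measurement vector r = (r1, r2, K)."""
    r1, r2, K = r
    x, y, T = psi
    trace = sum(K[i][j] * T[i][j] for i in range(len(x)) for j in range(len(y)))
    return 0.25 * (1.0 + dot(r1, x) + dot(r2, y) + trace)


def p_local(x0, x):
    """Eq. (10) for a single generalized bit."""
    return 0.5 * (1.0 + dot(x0, x))


# Illustrative unit vectors (assumed values, chosen only for the check).
x0, y0 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
x, y = [0.6, 0.8, 0.0], [0.0, 0.6, 0.8]

joint = p_measure(product_state(x0, y0), product_state(x, y))
factorized = p_local(x0, x) * p_local(y0, y)
```

Both `joint` and `factorized` evaluate to the same number, as required when the measurement vector is the product state itself.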

If we perform local transformations R_1 and R_2 on the
subsystems, the global state ψ = (x, y, T) is transformed to

    (R_1, R_2)\psi = (R_1 x, R_2 y, R_1 T R_2^T).   (20)

T is a real matrix and we can find its singular value
decomposition diag[t_1, ..., t_d] = R_1 T R_2^T, where R_1, R_2 are
orthogonal matrices which can be chosen to have determinant 1.
Therefore, we can choose the local bases such that the correlation
tensor T is a diagonal matrix:

    (R_1, R_2)(x, y, T) = (R_1 x, R_2 y, \mathrm{diag}[t_1, \dots, t_d]).   (21)



The last expression is called the Schmidt decomposition of the
state.

The local Bloch vectors satisfy ||x||, ||y|| ≤ 1 which, together
with the normalization condition (23) below, implies a bound on
the correlation ||T|| ≥ 1 for all pure states. The
following lemma identifies a simple entanglement witness for
pure states. The proof of this and all subsequent lemmas is
given in the Appendix.

Lemma 1. The lower bound ||T|| = 1 is saturated, if and only
if the state is a product state T = x y^T.

Recall that for every transformation U we can find its
orthogonal representation U = S O S^{-1} (the Schur-Auerbach
lemma), where S is an invertible matrix and O^T O = 1. The
matrix S is characteristic of the representation and should be
the same for all transformations U. If we choose some local
transformation U = (R_1, R_2), U will be orthogonal and thus
we can choose to set S = 1. The representation of
transformations is orthogonal, therefore they are norm preserving. By
applying simultaneously all (local and non-local)
transformations U to some product state (the reference state) ψ and to the
measurement vector which identifies it, r = ψ, we generate the
set of all pure states and corresponding measurement vectors.
Since we have 1 = P(r = ψ, ψ) = P(Ur, Uψ), the correspondence
r = ψ holds for any pure state ψ. Instead of the measurement
vector r in formula (14), we use the pure state which identifies
it. If the state ψ_1 = (x_1, y_1, T_1) is prepared and measurement
along the state ψ_2 = (x_2, y_2, T_2) is performed, the measured
probability is given by

    P_{12}(\psi_1, \psi_2) = \frac{1}{4}(1 + x_1^T x_2 + y_1^T y_2 + \mathrm{Tr}(T_1^T T_2)).   (22)

The set of pure states obeys P_12(ψ, ψ) = 1. We can define
the normalization condition for pure states
P_12(ψ, ψ) = (1/4)(1 + ||x||^2 + ||y||^2 + ||T||^2) = 1, where
||T||^2 = Tr(T^T T). Therefore we have:

    \|x\|^2 + \|y\|^2 + \|T\|^2 = 3,   (23)

for all pure states.
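
A quick numerical reading of the normalization condition (23), as a sketch under the assumption d = 3: a product state of two unit vectors and a state with vanishing local vectors and correlation tensor diag[1, -1, 1] (a tensor that appears later in the text) both give the value 3, while the totally mixed state does not. The helper names are illustrative.

```python
def squared_norms(psi):
    """Return (||x||^2, ||y||^2, ||T||^2) with ||T||^2 = Tr(T^T T)."""
    x, y, T = psi
    return (sum(a * a for a in x),
            sum(b * b for b in y),
            sum(t * t for row in T for t in row))


def satisfies_normalization(psi, tol=1e-12):
    """Check condition (23): the three squared norms sum to 3."""
    return abs(sum(squared_norms(psi)) - 3.0) <= tol


x, y = [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]
product = (x, y, [[a * b for b in y] for a in x])
entangled = ([0.0] * 3, [0.0] * 3,
             [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]])
mixed = ([0.0] * 3, [0.0] * 3, [[0.0] * 3 for _ in range(3)])
```

For the product state the three norms are (1, 1, 1); for the entangled example they are (0, 0, 3), so the whole "budget" sits in the correlations.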

An interesting observation can be made here. Although
seemingly axiom 2 does not imply any strong prior
restrictions to d, we surprisingly have obtained the explicit number
3 in the normalization condition (23). As we will see soon this
relation will play an important role in deriving d = 3 as the
only non-classical solution consistent with the axioms.

VI. THE MAIN PROOFS 

We will now show that only classical probability theory and 
quantum theory are in agreement with the three axioms. 

A. Ruling out the d even case 

Let us assume the total inversion Ex = -x to be a physical
transformation. Let ψ = (x, y, T) be a pure state of a composite
system. We apply total inversion to one of the subsystems
and obtain the state ψ' = (E, 1)(x, y, T) = (-x, y, -T). The
probability

    P_{12}(\psi, \psi') = \frac{1}{4}(1 - \|x\|^2 + \|y\|^2 - \|T\|^2)   (24)
                        = \frac{1}{2}(\|y\|^2 - 1),   (25)

where the second line follows from the normalization condition
(23), has to be nonnegative and therefore we have ||y|| = 1.
Similarly, we apply (1, E) to ψ and obtain ||x|| = 1. Since the local
vectors are of unit norm we have ||T|| = 1 and thus,
according to lemma 1, the state ψ is a product state. We conclude
that no entangled states can exist if E is a physical
transformation. As we will soon see, according to axiom 1 entangled
states must exist. Thus, E cannot represent a physical
transformation. We will now show that this implies that d has to be
odd. Recall that the set of transformations is at least the SO(d)
group. d cannot be even, since E would then have unit determinant
and would belong to SO(d). Hence d has to be odd, in which case E
has determinant -1. The set of physical transformations is the
SO(d) group.
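
The two ingredients of this argument can be illustrated with a short sketch: the determinant of the total inversion is (-1)^d, and for an entangled state the overlap with its locally inverted copy is negative. The example assumes d = 3 and uses the correlation tensor diag[1, -1, 1] from later in the text; the function names are hypothetical.

```python
def overlap(psi1, psi2):
    """Eq. (22): probability for state psi1 measured along psi2."""
    x1, y1, T1 = psi1
    x2, y2, T2 = psi2
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    trace = sum(dot(r1, r2) for r1, r2 in zip(T1, T2))
    return 0.25 * (1.0 + dot(x1, x2) + dot(y1, y2) + trace)


def invert_first(psi):
    """Apply (E, 1): (x, y, T) -> (-x, y, -T)."""
    x, y, T = psi
    return ([-a for a in x], list(y), [[-t for t in row] for row in T])


def det_total_inversion(d):
    """Determinant of E = -1 acting on d coordinates."""
    return (-1) ** d


e1 = [1.0, 0.0, 0.0]
product = (e1, e1, [[a * b for b in e1] for a in e1])
entangled = ([0.0] * 3, [0.0] * 3,
             [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]])

p_product = overlap(product, invert_first(product))        # (1/2)(||y||^2 - 1) = 0
p_entangled = overlap(entangled, invert_first(entangled))  # (1/2)(0 - 1) = -1/2
```

The negative value for the entangled example is exactly what formula (25) gives for ||y|| = 0, which is why E cannot be physical once entangled states exist.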



B. Ruling out the d > 3 case. 

Let us define one basis set of two generalized bit product
states:

    \psi_1 = (e_1, e_1, T_0 = e_1 e_1^T)   (26)
    \psi_2 = (-e_1, -e_1, T_0)   (27)
    \psi_3 = (-e_1, e_1, -T_0)   (28)
    \psi_4 = (e_1, -e_1, -T_0)   (29)

with e_1 = (1, 0, ..., 0)^T. Now, we define two subspaces S_12
and S_34 spanned by the states ψ_1, ψ_2 and ψ_3, ψ_4, respectively.
Axiom 1 states that these two subspaces behave like one-bit
spaces, therefore they are isomorphic to the (d - 1)-sphere,
S_12 ≅ S_34 ≅ S^{d-1}. The state ψ belongs to S_12 if and only if
the following holds:

    P_{12}(\psi, \psi_1) + P_{12}(\psi, \psi_2) = 1.   (30)

Since ψ_1, ..., ψ_4 form a complete basis set, we have

    P_{12}(\psi, \psi_3) = 0,   P_{12}(\psi, \psi_4) = 0.   (31)

A similar reasoning holds for states belonging to the S_34
subspace. Since the states ψ ∈ S_12 and ψ' ∈ S_34 are
perfectly distinguishable in a single shot experiment, we have
P_12(ψ, ψ') = 0. Therefore, S_12 and S_34 are orthogonal
subspaces.
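
That the four product states (26)-(29) are mutually perfectly distinguishable can be verified directly from formula (22). The following sketch assumes d = 3 for concreteness; the variable names are illustrative.

```python
def overlap(psi1, psi2):
    """Eq. (22): probability for state psi1 measured along psi2."""
    x1, y1, T1 = psi1
    x2, y2, T2 = psi2
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    trace = sum(dot(r1, r2) for r1, r2 in zip(T1, T2))
    return 0.25 * (1.0 + dot(x1, x2) + dot(y1, y2) + trace)


def scaled(T, c):
    return [[c * t for t in row] for row in T]


e1 = [1.0, 0.0, 0.0]
minus_e1 = [-1.0, 0.0, 0.0]
T0 = [[a * b for b in e1] for a in e1]  # T_0 = e_1 e_1^T

basis = [
    (e1, e1, T0),                       # psi_1
    (minus_e1, minus_e1, T0),           # psi_2
    (minus_e1, e1, scaled(T0, -1.0)),   # psi_3
    (e1, minus_e1, scaled(T0, -1.0)),   # psi_4
]

gram = [[overlap(a, b) for b in basis] for a in basis]
```

The table `gram` is the 4 × 4 identity: each basis state is found with certainty along itself and never along any of the other three.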

Axiom 1 requires the existence of entangled states, as is
apparent from the following Lemma 2.

Lemma 2. The only product states belonging to S_12 are ψ_1
and ψ_2.

We define a local mapping between orthogonal subspaces
S_12 and S_34. Let the state ψ = (x, y, T) ∈ S_12, with
x = (x_1, x_2, ..., x_d)^T and y = (y_1, y_2, ..., y_d)^T. Consider the
one-bit transformation R with the property R e_1 = -e_1. The local
transformation of this type maps the state from S_12 to S_34 as
shown by the following lemma:

Lemma 3. If the state ψ ∈ S_12, then ψ' = (R, 1)ψ ∈ S_34 and
ψ'' = (1, R)ψ ∈ S_34.

Let us define T_i^{(x)} = (T_{i1}, ..., T_{id}) and
T_i^{(y)} = (T_{1i}, ..., T_{di})^T. The correlation tensor can be
rewritten in two different ways:

    T = \begin{pmatrix} T_1^{(x)} \\ T_2^{(x)} \\ \vdots \\ T_d^{(x)} \end{pmatrix},
    T = \begin{pmatrix} T_1^{(y)} & T_2^{(y)} & \cdots & T_d^{(y)} \end{pmatrix}.   (32)



Consider now the case d > 3. We define local transformations
R_i flipping the first and i-th coordinate and R_jkl flipping
the first and j-th, k-th, and l-th coordinate with j ≠ k ≠ l ≠ 1.
Let ψ = (x, y, (T_1^{(x)}, ..., T_d^{(x)})^T) belong to S_12. According to
Lemma 3, the states ψ_i = (R_i, 1)ψ and ψ_jkl = (R_jkl, 1)ψ belong
to S_34, therefore P_12(ψ, ψ_i) = 0 and P_12(ψ, ψ_jkl) = 0. We
have:

    0 = P_{12}(\psi, \psi_i)   (33)
      = \frac{1}{4}(1 - x_1^2 + x_2^2 + \dots - x_i^2 + \dots + x_d^2 + \|y\|^2
          - \|T_1^{(x)}\|^2 + \|T_2^{(x)}\|^2 + \dots - \|T_i^{(x)}\|^2 + \dots + \|T_d^{(x)}\|^2)   (34)
      = \frac{1}{4}(1 - 2x_1^2 - 2x_i^2 - 2\|T_1^{(x)}\|^2 - 2\|T_i^{(x)}\|^2
          + \|x\|^2 + \|y\|^2 + \|T\|^2)   (35)
      = \frac{1}{2}(2 - x_1^2 - x_i^2 - \|T_1^{(x)}\|^2 - \|T_i^{(x)}\|^2).   (36)

Similarly, we expand P_12(ψ, ψ_jkl) = 0 and together with the
last equation we obtain:

    x_1^2 + x_i^2 + \|T_1^{(x)}\|^2 + \|T_i^{(x)}\|^2 = 2,   (37)
    x_1^2 + x_j^2 + x_k^2 + x_l^2 + \|T_1^{(x)}\|^2 + \|T_j^{(x)}\|^2 + \|T_k^{(x)}\|^2 + \|T_l^{(x)}\|^2 = 2.

Since this has to hold for all i, j, k, l we have:

    x_2 = x_3 = \dots = x_d = 0,   (38)
    T_2^{(x)} = T_3^{(x)} = \dots = T_d^{(x)} = 0.   (39)

We repeat this kind of reasoning for the transformations
(1, R_i) and (1, R_jkl) and obtain:

    y_1^2 + y_i^2 + \|T_1^{(y)}\|^2 + \|T_i^{(y)}\|^2 = 2,   (40)
    y_1^2 + y_j^2 + y_k^2 + y_l^2 + \|T_1^{(y)}\|^2 + \|T_j^{(y)}\|^2 + \|T_k^{(y)}\|^2 + \|T_l^{(y)}\|^2 = 2.

Therefore, we have

    y_2 = y_3 = \dots = y_d = 0,   (41)
    T_2^{(y)} = T_3^{(y)} = \dots = T_d^{(y)} = 0.   (42)

The only non-zero element of the correlation tensor is T_11 and
it has to be exactly 1, since ||T|| ≥ 1. This implies that ψ is a
product state, furthermore ψ = ψ_1 or ψ = ψ_2. The subspace S_12
would then contain only these two states, which contradicts its
isomorphism to the (d - 1)-sphere required by axiom 1.
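
The algebraic step from (34) to (35) holds for any triple (x, y, T), which makes it easy to check numerically. The sketch below uses d = 5 and arbitrary test numbers that are not meant to be a physical state; the function names are assumptions for this illustration.

```python
def overlap(psi1, psi2):
    """Eq. (22): probability for state psi1 measured along psi2."""
    x1, y1, T1 = psi1
    x2, y2, T2 = psi2
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    trace = sum(dot(r1, r2) for r1, r2 in zip(T1, T2))
    return 0.25 * (1.0 + dot(x1, x2) + dot(y1, y2) + trace)


def flip_first_subsystem(psi, coords):
    """Apply (R, 1), where R flips the sign of the listed coordinates."""
    x, y, T = psi
    new_x = [-v if k in coords else v for k, v in enumerate(x)]
    new_T = [[-t for t in row] if k in coords else list(row)
             for k, row in enumerate(T)]
    return (new_x, list(y), new_T)


def closed_form_35(psi, i):
    """Right-hand side of eq. (35) for the flip of coordinates 1 and i."""
    x, y, T = psi
    sq = lambda v: sum(a * a for a in v)
    norm_T = sum(sq(row) for row in T)
    return 0.25 * (1.0 - 2 * x[0] ** 2 - 2 * x[i] ** 2
                   - 2 * sq(T[0]) - 2 * sq(T[i])
                   + sq(x) + sq(y) + norm_T)


# Arbitrary test data for d = 5 (not a normalized physical state).
x = [0.1, 0.2, 0.3, 0.4, 0.5]
y = [0.5, 0.4, 0.3, 0.2, 0.1]
T = [[0.01 * (r + 1) * (c + 2) for c in range(5)] for r in range(5)]
psi = (x, y, T)

differences = [abs(overlap(psi, flip_first_subsystem(psi, {0, i}))
                   - closed_form_35(psi, i)) for i in range(1, 5)]
```

All entries of `differences` vanish up to rounding; the further step to (36) additionally needs the normalization condition (23).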

This concludes our proof that only the cases d = 1 and 
d = 3 are in agreement with our three axioms. To distinguish 
between the two cases, one can invoke the continuity axiom 
(3') and proceed as in the reconstruction given by Hardy [19].



VII. "TWO" QUANTUM MECHANICS 

We now obtain two solutions for the theory of a composite 
system consisting of two bits in the case when d = 3. One
of them corresponds to the standard quantum theory of two 
qubits, the other one to its "mirror" version in which the states 
are obtained from the ones from the standard theory by partial 
transposition. Both solutions are regular as far as one consid- 
ers composite systems of two bits, but the "mirror" one cannot 
be consistently constructed already for systems of three bits. 

Two conditions (30) and (31) put the constraint to the form
of ψ:

    x_1 = y_1,   T_{11} = 1.   (43)

The subspace S_12 is isomorphic to the sphere S^2. Let us
choose ψ complementary to the one-bit basis {ψ_1, ψ_2} in S_12.
We have P_12(ψ, ψ_1) = P_12(ψ, ψ_2) = 1/2 and thus x_1 = y_1 = 0.
For simplicity we write ψ in the form:

    \psi = \left( \begin{pmatrix} 0 \\ x \end{pmatrix},
                  \begin{pmatrix} 0 \\ y \end{pmatrix},
                  \begin{pmatrix} 1 & T_y^T \\ T_x & T \end{pmatrix} \right),   (44)

with x = (x_2, x_3)^T, y = (y_2, y_3)^T, T_x = (T_{21}, T_{31})^T,
T_y = (T_{12}, T_{13})^T and
T = \begin{pmatrix} T_{22} & T_{23} \\ T_{32} & T_{33} \end{pmatrix}.

Let R(φ) be a rotation around the e_1 axis. This
transformation keeps S_12 invariant. Now, we show that the state ψ as
given by equation (44) cannot be invariant under the local
transformation (1, R(φ)). To prove this by reductio ad absurdum
suppose the opposite, i.e. that (1, R(φ))ψ = ψ. We have three
conditions

    R(\phi) y = y,   T_y^T R^T(\phi) = T_y^T,   T R^T(\phi) = T,   (45)

which implies y = 0, T_y = 0 and T = 0, thus

    \psi = \left( \begin{pmatrix} 0 \\ x \end{pmatrix},
                  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
                  \begin{pmatrix} 1 & 0 \\ T_x & 0 \end{pmatrix} \right).   (46)

According to equations (37) and (40) we can easily check that
||x|| = 1, and thus ψ is locally equivalent to the state:

    \psi' = \left( \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},
                   \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
                   \begin{pmatrix} 1 & 0 & 0 \\ T_2' & 0 & 0 \\ T_3' & 0 & 0 \end{pmatrix} \right).   (47)

Let χ_1 = (-e_3, e_1, -e_3 e_1^T) and χ_2 = (-e_3, -e_1, e_3 e_1^T).
The two conditions P(ψ', χ_1) ≥ 0 and P(ψ', χ_2) ≥ 0 become

    \frac{1}{4}(1 - 1 - T_3') = -\frac{1}{4} T_3' \ge 0,   (48)
    \frac{1}{4}(1 - 1 + T_3') = \frac{1}{4} T_3' \ge 0,   (49)

and thus T_3' = 0. The normalization condition (23) gives T_2' =
±1. The state ψ' is not physical. This can be seen when one
performs the rotation (R, 1) where

    R = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\
                        -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\
                        0 & 0 & 1 \end{pmatrix}.   (50)

FIG. 4: Correlations between results obtained in measurements of
two bits in a maximally entangled (Bell) state in standard quantum
mechanics (Left) and "mirror quantum mechanics" (Right) along x,
y and z directions. Why do we never see correlations as given in the
table on the right? The opposite sign of correlations on the right and
on the left is not a matter of convention or labeling of outcomes. If
one can transport the two bits parallel to the same detector, one can
distinguish operationally between the two types of correlations [57].

The transformed correlation tensor has a component √2
which is non-physical. Therefore, the transformation
(1, R(φ))ψ draws a full circle of pure states in a plane
orthogonal to ψ_1 within the subspace S_12. Similarly, the
transformation (R(φ), 1) draws the same set of pure states when applied
to ψ. Hence, for every transformation (1, R(φ_1)) there exists a
transformation (R(φ_2), 1) such that (1, R(φ_1))ψ = (R(φ_2), 1)ψ.
This gives us a set of conditions:

    R(\phi_2) x = x,   (51)
    R(\phi_1) y = y,   (52)
    R(\phi_2) T_x = T_x,   (53)
    R(\phi_1) T_y = T_y,   (54)
    R(\phi_2) T = T R^T(\phi_1),   (55)

which are fulfilled if x = y = T_x = T_y = 0 and T =
diag[T_1, T_2]. Equation (37) gives T_1^2 = T_2^2 = 1 and we finally
end up with two different solutions:

    \psi_{QM} = (0, 0, \mathrm{diag}[1, -1, 1])  \vee  \psi_{MQM} = (0, 0, \mathrm{diag}[1, 1, 1]).   (56)

The first "M" in ψ_MQM stands for "mirror". The two
solutions are incompatible and cannot coexist within the same
theory. The first solution corresponds to the triplet state φ^+ of
ordinary quantum mechanics. The second solution is a totally
invariant state and has a negative overlap with, for example,
the singlet state ψ^- for which T = diag[-1, -1, -1]. That is,
if the system were prepared in one of the two states and the
other one were measured, the probability would be negative.
Nevertheless, both solutions are regular at the level of two
bits. The first belongs to ordinary quantum mechanics with
the singlet in the "antiparallel" subspace S_34 and the second
solution is "the singlet state in the parallel subspace" S_12. We
will show that one can build the full state space,
transformations and measurements in both cases. The states from one
quantum mechanics can be obtained from the other by partial
transposition, ψ_QM^PT = ψ_MQM. In particular, the four maximally
entangled states (Bell states) from "mirror quantum
mechanics" have correlations of the opposite sign of those from the
standard quantum mechanics (see Figure 4).
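
The incompatibility of the two solutions can be made concrete with formula (22). In the sketch below, partial transposition is modeled as a sign flip of the second Bloch component of the first subsystem; which component changes sign depends on the labeling of the Pauli matrices and is an assumption of this example, as are the function names.

```python
def overlap(psi1, psi2):
    """Eq. (22): probability for state psi1 measured along psi2."""
    x1, y1, T1 = psi1
    x2, y2, T2 = psi2
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    trace = sum(dot(r1, r2) for r1, r2 in zip(T1, T2))
    return 0.25 * (1.0 + dot(x1, x2) + dot(y1, y2) + trace)


def diag_state(t1, t2, t3):
    """State with vanishing local Bloch vectors and diagonal T."""
    T = [[t1, 0.0, 0.0], [0.0, t2, 0.0], [0.0, 0.0, t3]]
    return ([0.0] * 3, [0.0] * 3, T)


def partial_transpose_first(psi, flipped=1):
    """Assumed Bloch-picture action of PT_1: flip one component of x and
    the matching row of T (index `flipped` is a labeling assumption)."""
    x, y, T = psi
    new_x = [-v if k == flipped else v for k, v in enumerate(x)]
    new_T = [[-t for t in row] if k == flipped else list(row)
             for k, row in enumerate(T)]
    return (new_x, list(y), new_T)


psi_qm = diag_state(1.0, -1.0, 1.0)
psi_mqm = diag_state(1.0, 1.0, 1.0)
singlet = diag_state(-1.0, -1.0, -1.0)

p_mirror_singlet = overlap(psi_mqm, singlet)   # negative "probability"
p_qm_singlet = overlap(psi_qm, singlet)        # orthogonal states
```

The mirror solution and the singlet give -1/2, which is not a valid probability, whereas the ordinary triplet-type solution is simply orthogonal to the singlet.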

Now we show that the theory with "mirror states" is
physically inconsistent when applied to a composite system of three
bits. Let us first derive the full set of states and
transformations for two qubits in standard quantum mechanics. We have
seen that the state ψ_QM belongs to the subspace S_12, and
furthermore, that it is complementary (within S_12) to the product
states ψ_1 and ψ_2. The totally mixed state within the S_12
subspace is E_12 = (1/2)ψ_1 + (1/2)ψ_2. The states ψ_1 and ψ_QM span one
two-dimensional plane, and the set of pure states within this
plane is a circle:

    \psi(x) = E_{12} + \cos x (\psi_1 - E_{12}) + \sin x (\psi_{QM} - E_{12})   (57)
            = (\cos x \, e_1, \cos x \, e_1, \mathrm{diag}[1, -\sin x, \sin x]).   (58)
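
The equality of the two forms (57) and (58), and the fact that every point on the circle satisfies the normalization (23), can be checked with a few lines. The helper names and sample angles are illustrative assumptions.

```python
import math


def combine(terms):
    """Linear combination sum(c * psi) of triples psi = (x, y, T), d = 3."""
    x = [sum(c * p[0][i] for c, p in terms) for i in range(3)]
    y = [sum(c * p[1][i] for c, p in terms) for i in range(3)]
    T = [[sum(c * p[2][i][j] for c, p in terms) for j in range(3)]
         for i in range(3)]
    return (x, y, T)


def flatten(psi):
    x, y, T = psi
    return list(x) + list(y) + [t for row in T for t in row]


e1 = [1.0, 0.0, 0.0]
zero = [0.0, 0.0, 0.0]
psi_1 = (e1, e1, [[1.0, 0, 0], [0, 0, 0], [0, 0, 0]])
psi_qm = (zero, zero, [[1.0, 0, 0], [0, -1.0, 0], [0, 0, 1.0]])
E12 = (zero, zero, [[1.0, 0, 0], [0, 0, 0], [0, 0, 0]])  # (psi_1 + psi_2) / 2


def psi_circle(angle):
    """Eq. (57), rewritten as (1 - c - s) E12 + c psi_1 + s psi_QM."""
    c, s = math.cos(angle), math.sin(angle)
    return combine([(1.0 - c - s, E12), (c, psi_1), (s, psi_qm)])


def psi_closed(angle):
    """Eq. (58)."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c, 0.0, 0.0], [c, 0.0, 0.0],
            [[1.0, 0, 0], [0, -s, 0], [0, 0, s]])


angles = [0.0, 0.7, 2.5]
max_gap = max(abs(a - b) for t in angles
              for a, b in zip(flatten(psi_circle(t)), flatten(psi_closed(t))))
norms = [sum(v * v for v in flatten(psi_closed(t))) for t in angles]
```

Here `norms` collects ||x||^2 + ||y||^2 + ||T||^2 = 2cos^2 x + 1 + 2sin^2 x, which equals 3 for every angle.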

We can apply a complete set of local transformations to the
set ψ(x) to obtain the set of all pure two-qubit states. Let us
represent a pure state ψ = (x, y, T) by the 4 × 4 Hermitian
matrix ρ:

    \rho = \frac{1}{4}\left(1 \otimes 1 + \sum_{i=1}^{3} x_i \sigma_i \otimes 1
           + \sum_{i=1}^{3} y_i 1 \otimes \sigma_i
           + \sum_{i,j=1}^{3} T_{ij} \sigma_i \otimes \sigma_j\right),   (59)

where σ_i, i ∈ {1, 2, 3}, are the three Pauli matrices. It is
easy to show that the set of states (57) corresponds to the
set of one-dimensional projectors |ψ(x)><ψ(x)|, where
|ψ(x)> = cos(x/2)|00> + sin(x/2)|11>. The action of local
transformations (R_1, R_2)ψ corresponds to the local unitary
transformation U_1 ⊗ U_2 |ψ><ψ| U_1^† ⊗ U_2^†, where the
correspondence between U and R is given by the homomorphism
between the groups SU(2) and SO(3):

    U \rho U^\dagger = \frac{1}{2}\left(1 + \sum_{i=1}^{3}\left(\sum_{j=1}^{3} R_{ij} x_j\right)\sigma_i\right).   (60)

Here R_ij = (1/2) Tr(σ_i U σ_j U^†) and x_i = Tr σ_i ρ. When we apply a
complete set of local transformations to the states |ψ(x)> we
obtain the whole set of pure states for two qubits. The group
of transformations is the set of unitary transformations SU(4).
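
The correspondence between the circle of states (58) and the projectors onto cos(x/2)|00> + sin(x/2)|11> can be verified with plain 4 × 4 matrix arithmetic. The sketch assumes that the index order (1, 2, 3) corresponds to (σ_z, σ_y, σ_x) in the computational basis; this labeling is an assumption made so that formula (58) matches, and the helper names are hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULIS = [SZ, SY, SX]  # assumed labeling of sigma_1, sigma_2, sigma_3


def kron(a, b):
    """Kronecker product of two 2x2 matrices."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def rho_from_bloch(x, y, T):
    """Eq. (59): density matrix from local vectors and correlation tensor."""
    rho = [[0j] * 4 for _ in range(4)]

    def add(matrix, coeff):
        for r in range(4):
            for c in range(4):
                rho[r][c] += coeff * matrix[r][c]

    add(kron(I2, I2), 1.0)
    for i in range(3):
        add(kron(PAULIS[i], I2), x[i])
        add(kron(I2, PAULIS[i]), y[i])
        for j in range(3):
            add(kron(PAULIS[i], PAULIS[j]), T[i][j])
    return [[v / 4 for v in row] for row in rho]


def projector(angle):
    """|psi><psi| for cos(angle/2)|00> + sin(angle/2)|11> (real amplitudes)."""
    amps = [math.cos(angle / 2), 0.0, 0.0, math.sin(angle / 2)]
    return [[a * b for b in amps] for a in amps]


angle = 0.7  # arbitrary sample point on the circle
c, s = math.cos(angle), math.sin(angle)
rho = rho_from_bloch([c, 0, 0], [c, 0, 0],
                     [[1, 0, 0], [0, -s, 0], [0, 0, s]])
proj = projector(angle)
max_gap = max(abs(rho[r][k] - proj[r][k]) for r in range(4) for k in range(4))
```

With this labeling the two matrices agree entry by entry, so the state (58) is indeed a rank-one projector.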

The set of states from "mirror quantum mechanics" can be
obtained by applying partial transposition to the set of
quantum states. Formally, partial transposition with respect to
subsystem 1 is defined by its action on a set of product operators:

    PT_1(\rho_1 \otimes \rho_2) = \rho_1^T \otimes \rho_2,   (61)

where ρ_1 and ρ_2 are arbitrary operators. Similarly, we can
define the partial transposition with respect to subsystem 2,
PT_2. To each unitary transformation U in quantum
mechanics we define the corresponding transformation in "mirror
mechanics", e.g. with respect to subsystem 1: PT_1 U PT_1.
Therefore, the set of transformations is a conjugate group
PT_1 SU(4) PT_1 := {PT_1 U PT_1 | U ∈ SU(4)}. Note that we could
equally have chosen to apply partial transposition with respect
to subsystem 2, and would obtain the same set of states. In
fact, one can show that PT_1 U PT_1 = PT_2 U^* PT_2, where U^* is
the conjugate unitary transformation (see Lemma 4 in the
Appendix). Therefore, the two conjugate groups are the same,
PT_1 SU(4) PT_1 = PT_2 SU(4) PT_2. We can generate the set of
"mirror states" by applying all the transformations PT U PT
to some product state, regardless of which particular partial
transposition is used.



Now, we show that "mirror mechanics" cannot be
consistently extended to composite systems consisting of three bits.
Let ψ_p = (x, y, z, T_12, T_13, T_23, T_123) be some product state of
three bits, where x, y and z are local Bloch vectors, and T_12, T_13,
T_23 and T_123 are two- and three-body correlation tensors,
respectively. We can apply the transformations PT U_ij PT to a
composite system of i and j, and we are free to choose with
respect to which subsystem (i or j) to take the partial
transposition. Furthermore, we can combine transformations in
the 12 and 23 subsystems such that the resulting state is genuinely
three-partite entangled, and we can choose to partially
transpose subsystem 2 in both cases. We obtain the transformation

    U_{123} = PT_2 U_{12} PT_2 PT_2 U_{23} PT_2   (62)
            = PT_2 U_{12} U_{23} PT_2.   (63)

When we apply U_123 to ψ_p we obtain the state
PT_2 U_12 U_23 φ_p, where φ_p = PT_2 ψ_p is again some product state.
The state U_12 U_23 φ_p is a quantum three-qubit state. Since the states
ψ_p and φ_p are product states and do belong to standard
quantum states, we can use the formalism of quantum mechanics
and denote them as |ψ_p> and |φ_p>. Furthermore, since the state
|ψ_p> is an arbitrary product state, without loss of generality we
set |φ_p> = |0>|0>|0>. We can choose U_12 and U_23 such that:

    U_{12}|0\rangle|0\rangle = |0\rangle|0\rangle,   (64)
    U_{12}|0\rangle|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle|1\rangle + |1\rangle|0\rangle),   (65)
    U_{23}|0\rangle|0\rangle = \frac{1}{\sqrt{3}}|0\rangle|1\rangle + \sqrt{\frac{2}{3}}|1\rangle|0\rangle.   (66)

This way we can generate the W-state

    |W\rangle = U_{12} U_{23}|0\rangle|0\rangle|0\rangle   (67)
              = \frac{1}{\sqrt{3}}(|0\rangle|0\rangle|1\rangle + |0\rangle|1\rangle|0\rangle + |1\rangle|0\rangle|0\rangle).   (68)

When we apply partial transposition with respect to subsystem
2, we obtain the corresponding "mirror W-state" which we
denote as the W_M-state, W_M = PT_2 W. The local Bloch vectors
and two-body correlation tensors for the W state are

    x = y = z = (0, 0, \frac{1}{3})^T,   (69)
    T_{12} = T_{13} = T_{23} = \mathrm{diag}[\frac{2}{3}, \frac{2}{3}, -\frac{1}{3}],   (70)

where |0> corresponds to result +1. Consequently, the local
Bloch vectors and the correlation tensors for the W_M-state are

    x = y = z = (0, 0, \frac{1}{3})^T,   (71)
    T_{12} = T_{23} = \mathrm{diag}[\frac{2}{3}, -\frac{2}{3}, -\frac{1}{3}],   (72)
    T_{13} = \mathrm{diag}[\frac{2}{3}, \frac{2}{3}, -\frac{1}{3}].   (73)

The asymmetry in the signs of correlations in the tensors
T_12, T_23 and T_13 leads to inconsistencies because they
define three different reduced states ψ_ij = (x_i, x_j, T_ij), ij ∈
{12, 23, 13}, which cannot coexist within a single theory. The
states ψ_12 and ψ_23 belong to "mirror quantum mechanics",
while the state ψ_13 belongs to ordinary quantum mechanics.






To see this, take the state φ = (0, 0, diag[-1, -1, 1]) which
is locally equivalent to the state ψ_MQM = (0, 0, 1). The overlap
(measured probability) between the states ψ_13 and φ is
negative:

    P_{12}(\phi, \psi_{13}) = \frac{1}{4}\left(1 - \frac{2}{3} - \frac{2}{3} - \frac{1}{3}\right) = -\frac{1}{6}.   (74)

We conclude that "mirror quantum mechanics" - while be- 
ing a perfectly regular solution for a theory of two bits - can- 
not be consistently extended to also describe systems consist- 
ing of many bits. This also answers the question why we find 
in nature only four types of correlations as given in the table 
(Figure 4) on the left, rather than all eight logically possible 
ones. 
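
The numbers in equations (69)-(74) can be reproduced from the W-state amplitudes. The sketch below computes Pauli expectation values directly on basis kets, with components ordered (x, y, z) and |0> assigned the outcome +1; partial transposition of the second qubit is modeled as a sign flip of its y-component, which is an assumption about the Bloch-picture action. All names are illustrative.

```python
import math

AMP = 1.0 / math.sqrt(3.0)
W = {(0, 0, 1): AMP, (0, 1, 0): AMP, (1, 0, 0): AMP}  # W-state amplitudes
AXES = "xyz"


def apply_pauli(axis, bit):
    """Return (coefficient, new_bit) for sigma_axis acting on |bit>."""
    if axis == "x":
        return (1, 1 - bit)
    if axis == "y":
        return (1j if bit == 0 else -1j, 1 - bit)
    return (1 if bit == 0 else -1, bit)  # axis == "z"


def expectation(state, ops):
    """<state| product of Paulis |state>; ops maps qubit index -> axis."""
    total = 0
    for ket, amp in state.items():
        coeff, new = amp, list(ket)
        for qubit, axis in ops.items():
            c, new[qubit] = apply_pauli(axis, ket[qubit])
            coeff *= c
        total += state.get(tuple(new), 0).conjugate() * coeff
    return total.real


def bloch(q):
    return [expectation(W, {q: a}) for a in AXES]


def corr(q1, q2):
    return [[expectation(W, {q1: a, q2: b}) for b in AXES] for a in AXES]


def mirror_corr(T, transposed_is_first):
    """Flip the sign of the y-entries that belong to the transposed qubit."""
    out = [list(row) for row in T]
    for k in range(3):
        if transposed_is_first:
            out[1][k] = -out[1][k]
        else:
            out[k][1] = -out[k][1]
    return out


T12_mirror = mirror_corr(corr(0, 1), transposed_is_first=False)
T23_mirror = mirror_corr(corr(1, 2), transposed_is_first=True)
T13 = corr(0, 2)  # untouched by the transposition of qubit 2

# Overlap (22) of the reduced state (x, z, T13) with phi = (0, 0, diag[-1, -1, 1]).
phi_T = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
overlap_13 = 0.25 * (1.0 + sum(T13[i][j] * phi_T[i][j]
                               for i in range(3) for j in range(3)))
```

The diagonal of `T12_mirror` comes out as (2/3, -2/3, -1/3) while that of `T13` stays (2/3, 2/3, -1/3), and `overlap_13` evaluates to -1/6, the negative value quoted in (74).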



VIII. HIGHER-DIMENSIONAL SYSTEMS AND STATE 
UP-DATE RULE IN MEASUREMENT 

Having obtained d = 3 for a two-dimensional system we
have derived quantum theory of this system. We have also
reconstructed quantum mechanics of a composite system
consisting of two qubits. Further reconstruction of quantum
mechanics can proceed as in Hardy's work [19]. In particular,
the reconstruction of higher-dimensional systems from the
two-dimensional ones and the general transformations of the 
state after measurement are explicitly given there. We only 
briefly comment on them here. 

In order to derive the state space, measurements and trans- 
formations for a higher-dimensional system, we can use quan- 
tum theory of a two-dimensional system in conjunction with 
axiom 1. The axiom requires that upon any two linearly
independent states one can construct a two-dimensional subspace
that is isomorphic to the state space of a qubit (2-sphere). The 
state space of a higher dimensional system can be character- 
ized such that if the state is restricted to any given two dimen- 
sional subspace, then it behaves like a qubit. The fact that all 
other (higher-dimensional) systems can be built out of two- 
dimensional ones suggests that the latter can be considered as 
fundamental constituents of the world and gives a justification 
for the usage of the term "elementary system" in the formula- 
tions of the axioms. 

When a measurement is performed and an outcome is
obtained, our knowledge about the state of the system changes and
its representation in the form of probabilities must be updated
to be in agreement with the new knowledge acquired in the
measurement. This is the most natural update rule present in
any probability theory. Only if one views this change as a
real physical process do conceptual problems arise related to a
discontinuous and abrupt "collapse of the wave function". There
is no basis for any such assumption. Associated with each
outcome is the measurement vector p. When the outcome is 
observed the state after the measurement is updated to p and 
the measurement will be a certain transformation on the ini- 
tial state. Update rules for more general measurements can 
accordingly be given. 



IX. WHAT THE PRESENT RECONSTRUCTION TELLS 
US ABOUT QUANTUM MECHANICS 

It is often said that reconstructions of quantum theory 
within an operational approach are devoid of ontological com- 
mitments, and that nothing can be generally said about the on- 
tological content that arises from the first principles or about 
the status of the notion of realism. As a supporting argument 
one usually notes that within a realistic world view one would 
anyway expect quantum theory at the operational level to be 
deducible from some underlying theory of "deeper reality". 
After all, we have the de Broglie-Bohm theory [58] which is a
nonlocal realistic theory in full agreement with the predictions 
of (non-relativistic) quantum theory. Having said this, we can- 
not but emphasize that realism does stay "orthogonal" to the 
basic idea behind our reconstruction. 

Be it local or nonlocal, realism asserts that outcomes cor- 
respond to actualities objectively existing prior to and inde- 
pendent of measurements. On the other hand, we have shown 
that the finiteness of information carrying capacity of quantum 
systems is an important ingredient in deriving quantum theory. 
This capacity is not enough to allow assignment of definite 
values to outcomes of all possible measurements. The ele- 
mentary system has the information carrying capacity of one 
bit. This is signified by the possibility to decompose any state 
of an elementary system (qubit) in quantum mechanics in two 
orthogonal states. In a realistic theory based on hidden
variables and an "epistemic constraint" on an observer's
knowledge of the variables' values one can reproduce this feature at
the level of the entire distribution of the hidden variables [59].
That this is possible is not surprising if one bears in mind that
hidden-variable theories were in the first place introduced to
reproduce quantum mechanics and yet give a more complete
description [67]. But any realism of that kind at the same
time assumes an infinite information capacity at the level of 
hidden variables. Even to reproduce measurements on a sin- 
gle qubit requires infinitely many orthogonal hidden-variable 
states SEHH. It might be a matter of taste whether or 
not one is ready to work with this "ontological excess
baggage" [60] not doing any explanatory work at the operational
level. But it is certainly conceptually distinctly different from 
the theory analyzed here, in which the information capacity of 
the most elementary systems - those which are by definition 
not reducible further - is fundamentally limited. 

To further clarify our position consider the Mach-Zehnder 
interferometer in which both the path information and inter- 
ference observable are dichotomic, i.e. two-valued observ- 
ables. It is meaningless to speak about "the path the parti- 
cle took in the interferometer in the interference experiment" 
because this would already require assigning 2 bits of
information to the system, which would exceed its information
capacity of 1 bit [63]. The information capacity of the
system is simply not enough to provide definite outcomes to all
possible measurements. Then, by necessity the outcome in
some experiments must contain an element of randomness and
there must be observables that are complementary to each
other. Entanglement and consequently the violation of Bell's
inequality (and thus of local realism) arise from the possibility
to define an abstract elementary system carrying at most one
bit such that correlations ("00" and "11" in a joint
measurement of two subsystems) are basis states.

X. CONCLUSIONS 

Quantum theory is our most accurate description of nature 
and is fundamental to our understanding of, for example, the 
stability of matter, the periodic table of chemical elements, 
and the energy of the sun. It has led to the development 
of great inventions like the electronic transistor, the laser, or 
quantum cryptography. Given the enormous success of quan- 
tum theory, can we consider it as our final and ultimate theory? 
Quantum theory has caused much controversy in interpreting 
what its philosophical and epistemological implications are. 
At the heart of this controversy lies the fact that the theory 
makes only probabilistic predictions. In recent years, however,
it has been shown that some features of quantum theory that one
might have expected to be uniquely quantum are in fact
highly generic for generalized probabilistic theories. Is there
any reason why the universe should obey the laws of quantum 
theory, as opposed to any other possible probabilistic theory? 

In this work we have shown that classical probability theory
and quantum theory - the only two probability theories
for which we have empirical evidence - are special in
that they fulfill three reasonable axioms on the systems'
information carrying capacity, on the notion of locality and 
on the reversibility of transformations. The two theories can 
be separated if one restricts the transformations between the 
pure states to be continuous fl^l . An interesting finding is 
that quantum theory is the only non-classical probability theory
that can exhibit entanglement without conflicting with one or
more axioms. Therefore - to use Schrödinger's words [64]
- entanglement is not only "the characteristic trait of quan- 
tum mechanics, the one that enforces its entire departure from 
classical lines of thought", but also the one that enforces the 
departure from a broad class of more general probabilistic the- 
ories. 



Acknowledgments 

We thank M. Aspelmeyer, J. Kofler, T. Paterek and A. 
Zeilinger for discussions. We acknowledge support from 
the Austrian Science Foundation FWF within Project No. 
P19570-N16, SFB and CoQuS No. W1210-N16, the European
Commission Project QAP (No. 015848) and the Foundational
Questions Institute (FQXi).



XI. APPENDIX 

In this appendix we give the proofs of the lemmas from the 
main text. 

Lemma 1. The lower bound $\|T\| = 1$ is saturated if and only
if the state is a product state, $T = x y^{\mathrm{T}}$.

Proof. If the state is a product state, then $\|T\|^2 = \|x\|^2 \|y\|^2 = 1$.
Conversely, assume that the state $\psi = (x, y, T)$ satisfies
$\|T\| = 1$. Normalization (23) gives $\|x\| = \|y\| = 1$.
Let $\phi_p = (-x, -y, T_0 = x y^{\mathrm{T}})$ be a product state. We have
$P(\psi, \phi_p) \geq 0$ and therefore

$$1 - \|x\|^2 - \|y\|^2 + \mathrm{Tr}(T^{\mathrm{T}} T_0) = -1 + \mathrm{Tr}(T^{\mathrm{T}} T_0) \geq 0. \tag{75}$$

The last inequality, $\mathrm{Tr}(T^{\mathrm{T}} T_0) \geq 1$, can be read as $(T, T_0) \geq 1$,
where $(\cdot, \cdot)$ is the scalar product in Hilbert-Schmidt space. Since
the vectors $T$ and $T_0$ are normalized, $\|T\| = \|T_0\| = 1$, the scalar
product between them always satisfies $(T, T_0) \leq 1$. Therefore, we
have $(T, T_0) = 1$, which is equivalent to $T = T_0 = x y^{\mathrm{T}}$.

QED 
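The argument of Lemma 1 can be checked numerically with a minimal sketch. It assumes the overlap rule P = (1 + x.x' + y.y' + Tr(T^T T'))/4, which is the form implied by Eq. (75); the helper names and the particular vectors are hypothetical. A product state saturates P >= 0 against its antipodal product state, while a different correlation tensor of unit norm paired with the same local vectors would give a negative value and is therefore excluded.

```python
def dot(a, b):
    """Euclidean scalar product of two real vectors."""
    return sum(p * q for p, q in zip(a, b))


def outer(a, b):
    """Matrix a b^T as a nested list."""
    return [[p * q for q in b] for p in a]


def hs_inner(s, t):
    """Hilbert-Schmidt scalar product Tr(S^T T) for real matrices."""
    return sum(s[i][j] * t[i][j] for i in range(len(s)) for j in range(len(s[0])))


def overlap(psi, phi):
    """Assumed rule P = (1 + x.x' + y.y' + Tr(T^T T')) / 4, cf. Eq. (75)."""
    (x1, y1, t1), (x2, y2, t2) = psi, phi
    return 0.25 * (1.0 + dot(x1, x2) + dot(y1, y2) + hs_inner(t1, t2))


# Unit local vectors chosen only for illustration.
x = (0.6, 0.8, 0.0)
y = (0.0, 0.6, 0.8)
t0 = outer(x, y)

# Antipodal product state phi_p = (-x, -y, T_0) from the proof.
phi_p = (tuple(-c for c in x), tuple(-c for c in y), t0)

# Candidate 1: the product state itself, T = x y^T.
psi_product = (x, y, t0)

# Candidate 2: same x and y, but a different unit-norm correlation tensor.
t_other = outer(x, (0.0, 0.8, 0.6))
psi_other = (x, y, t_other)

if __name__ == "__main__":
    print("product state:", overlap(psi_product, phi_p))   # approximately zero
    print("other tensor: ", overlap(psi_other, phi_p))     # negative, hence excluded
```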

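Lemma 2. The only product states belonging to $S_{12}$ are $\psi_1$
and $\psi_2$.

Proof. Let $\psi_p = (x, y, x y^{\mathrm{T}}) \in S_{12}$. We have

\begin{align}
1 &= P_{12}(\psi_p, \psi_1) + P_{12}(\psi_p, \psi_2) \tag{76} \\
  &= \tfrac{1}{4}\bigl(1 + x e_1 + y e_1 + (x e_1)(y e_1)\bigr) \tag{77} \\
  &\quad + \tfrac{1}{4}\bigl(1 - x e_1 - y e_1 + (x e_1)(y e_1)\bigr) \tag{78} \\
  &= \tfrac{1}{2}\bigl(1 + (x e_1)(y e_1)\bigr) \tag{79} \\
  &\Rightarrow\ x e_1 = y e_1 = 1 \ \lor\ x e_1 = y e_1 = -1 \tag{80} \\
  &\Leftrightarrow\ x = y = e_1 \ \lor\ x = y = -e_1. \tag{81}
\end{align}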

QED 
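As a quick sanity check of Eqs. (76)-(81), the following sketch evaluates the sum of the two probabilities for a few product states. It uses only the expressions in Eqs. (77) and (78); the function name `s12_weight` and the test vectors are hypothetical and serve only as an illustration.

```python
import math

E1 = (1.0, 0.0, 0.0)


def dot(a, b):
    """Euclidean scalar product of two real vectors."""
    return sum(p * q for p, q in zip(a, b))


def s12_weight(x, y):
    """Sum of Eqs. (77) and (78) for the product state (x, y, x y^T)."""
    a, b = dot(x, E1), dot(y, E1)
    p1 = 0.25 * (1.0 + a + b + a * b)   # Eq. (77)
    p2 = 0.25 * (1.0 - a - b + a * b)   # Eq. (78)
    return p1 + p2                      # equals (1 + a*b) / 2, Eq. (79)


def unit(theta):
    """Unit vector in the plane spanned by the first two axes."""
    return (math.cos(theta), math.sin(theta), 0.0)


if __name__ == "__main__":
    minus_e1 = (-1.0, 0.0, 0.0)
    print(s12_weight(E1, E1))              # x = y = e_1
    print(s12_weight(minus_e1, minus_e1))  # x = y = -e_1
    print(s12_weight(E1, minus_e1))        # opposite local vectors
    print(s12_weight(unit(0.3), unit(0.3)))  # tilted product state
```

Only the first two cases reach the value 1 required for membership in $S_{12}$; any tilt away from $\pm e_1$ lowers the sum below 1, in line with Eq. (81).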

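Lemma 3. If the state $\psi \in S_{12}$, then $\psi' = (R, \mathbb{1})\psi \in S_{34}$ and
$\psi'' = (\mathbb{1}, R)\psi \in S_{34}$.

Proof. If $\psi \in S_{12}$ we have

\begin{align}
1 &= P_{12}(\psi, \psi_1) + P_{12}(\psi, \psi_2) \tag{82} \\
  &= P_{12}\bigl((R, \mathbb{1})\psi, (R, \mathbb{1})\psi_1\bigr) + P_{12}\bigl((R, \mathbb{1})\psi, (R, \mathbb{1})\psi_2\bigr) \tag{83} \\
  &= P_{12}(\psi', \psi_3) + P_{12}(\psi', \psi_4). \tag{84}
\end{align}
Similarly, one can show that $(\mathbb{1}, R)\psi \in S_{34}$.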

QED 

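Lemma 4. Let $\mathcal{U}$ be some operator with the following action
in the Hilbert-Schmidt space: $\mathcal{U}(\rho) = U \rho U^{\dagger}$, and let $\mathrm{PT}_1$ and
$\mathrm{PT}_2$ be the partial transpositions with respect to subsystems 1
and 2, respectively. The following identity holds: $\mathrm{PT}_1 \mathcal{U}\, \mathrm{PT}_1 =
\mathrm{PT}_2 \mathcal{U}^{*} \mathrm{PT}_2$, where $\mathcal{U}^{*}$ is the complex-conjugate operator.

Proof. We can expand $U$ into some product basis in the
Hilbert-Schmidt space, $U = \sum_{ij} u_{ij} A_i \otimes B_j$. We have

\begin{align}
\mathrm{PT}_1 \mathcal{U}\, \mathrm{PT}_1 (\rho_1 \otimes \rho_2)
  &= \mathrm{PT}_1 \bigl\{ U (\rho_1^{\mathrm{T}} \otimes \rho_2) U^{\dagger} \bigr\} \tag{85} \\
  &= \mathrm{PT}_2 \Bigl\{ \sum_{ijkl} u_{ij} u_{kl}^{*} (A_k^{*} \rho_1 A_i^{\mathrm{T}}) \otimes (B_l^{*} \rho_2^{\mathrm{T}} B_j^{\mathrm{T}}) \Bigr\} \notag \\
  &= \mathrm{PT}_2 \Bigl\{ \sum_{ijkl} u_{kl}^{*} u_{ij} (A_k^{*} \otimes B_l^{*}) (\rho_1 \otimes \rho_2^{\mathrm{T}}) (A_i^{\mathrm{T}} \otimes B_j^{\mathrm{T}}) \Bigr\} \notag \\
  &= \mathrm{PT}_2 \mathcal{U}^{*} \mathrm{PT}_2 (\rho_1 \otimes \rho_2), \notag
\end{align}
for arbitrary operators $\rho_1$ and $\rho_2$.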

QED 
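The identity of Lemma 4 can also be verified numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes two two-level subsystems, orders the composite indices as row = 2*i1 + i2, and reads $\mathcal{U}^{*}$ as the map X -> U* X U^T. All function names are hypothetical, and the matrices are random and not restricted to unitaries or to product operators (the identity extends to those by linearity).

```python
import random


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conjugate(a):
    """Entry-wise complex conjugate."""
    return [[z.conjugate() for z in row] for row in a]


def transpose(a):
    """Matrix transpose."""
    return [list(col) for col in zip(*a)]


def partial_transpose(m, subsystem):
    """Partial transposition of a 4x4 matrix on subsystem 1 or 2."""
    out = [[0j] * 4 for _ in range(4)]
    for i1 in range(2):
        for i2 in range(2):
            for j1 in range(2):
                for j2 in range(2):
                    value = m[2 * i1 + i2][2 * j1 + j2]
                    if subsystem == 1:
                        out[2 * j1 + i2][2 * i1 + j2] = value  # swap i1 and j1
                    else:
                        out[2 * i1 + j2][2 * j1 + i2] = value  # swap i2 and j2
    return out


def random_matrix(rng):
    """Random complex 4x4 matrix (illustrative input only)."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4)]
            for _ in range(4)]


def lhs(u, rho):
    """PT_1 U PT_1 applied to rho, with U(X) = U X U^dagger."""
    x = partial_transpose(rho, 1)
    return partial_transpose(matmul(matmul(u, x), transpose(conjugate(u))), 1)


def rhs(u, rho):
    """PT_2 U* PT_2 applied to rho, with U*(X) = U^* X U^T."""
    x = partial_transpose(rho, 2)
    return partial_transpose(matmul(matmul(conjugate(u), x), transpose(u)), 2)


def max_abs_diff(a, b):
    """Largest entry-wise deviation between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


rng = random.Random(0)
u, rho = random_matrix(rng), random_matrix(rng)

if __name__ == "__main__":
    print("max deviation:", max_abs_diff(lhs(u, rho), rhs(u, rho)))
```

The deviation should vanish up to floating-point rounding, whereas a single partial transposition alone changes a generic matrix.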